UnRTF also supports LaTeX and ASCII plain text output. Use UnRTF to convert RTF files to HTML files. Use djvutoxml command from DjVuLibre library ( ) to convert DjVu to XML. Use pstopdf command to convert PostScript to PDF. Please refer to LibreOffice documentation for details: On Windows command line, the convert-to parameter uses only one dash. The output file will be named input_file.TargetFileExtension. Note that the square brackets around : mean that this part is optional. If you have LibreOffice installed on your system, you can run soffice command in headless mode to convert documents: $ soffice -headless -convert-to input_file.xxx Use cupsfilter command to convert TXT to PDF and HTML to PDF. Use textutil command to convert among txt, rtf, rtfd, html, doc, docx, odt, wordml, and webarchive formats. Specfic Document Format Conversions Mac OS X Note that pandoc supports the newer XML-based docx MS Word format but not the older OLE-based doc MS Word format. Use pandoc command to convert amongst popular markup formats: Use paps command ( ) to format UTF-8 plain text files. Unfortunately, enscript does not support UTF-8 encoding. Use enscript command ( ) to convert text files to PostScript, HTML, and RTF. Use cupsfilter command to convert non-PDF formats to PDF. textutil is based on the Cocoa Framework, so it isn't available on Linux. The -info option extracts basic metadata from files of these formats. Use textutil command to convert plain text to rtf, rtfd, html, doc, docx, odt, wordml, and webarchive formats. Use xml_grep to extract text from XML document: xml_grep example.xml -text_onlyĮxtract text only from mytag tag: xml_grep 'mytag' example.xml -text_only Use the djvutxt command to extract text from DjVu, assuming a text layer exsits. DjVuĭjVuLibre ( ), an open source DjVu library and viewer, comes with a suite of command line utilities. Use html2text command ( ) to extract text from HTML file. Use pdftotext command to extract text from PDF file, assuming a text layer exists. Poppler library ( ), based on Xpdf, comes with a suite of PDF tools. For a list of supported encodings run $ iconv -l The -c option discards unconvertible characters, and pointy brackets denote required options. The basic usage is $ iconv -c -f -t input.txt > output.txt Use iconv command to convert plain text from one encoding to another. For image files make sure you have ImageMagick installed, then use identify command to extract image metadata. Use file command to obtain basic metadata for most file formats. Everyone should embrace the mantra "plain text is beautiful". Distribute documents as plain text using UTF-8 encoding whenever possible.
The following example shows processes for the user linuxhint as an effective user.This document outlines some ideas for document conversion on Linux and Mac OS X platforms using command line tools. Note: If you don’t know what effective and real users are, the explanation is at the end of this section. This will show the effective user, whose permissions are used to run the process, but not the user who called the process (Real User). If you want to check only processes executed by a specific user (Effective User), you can use -u flag explained previously without additional flags, followed by the username whose processes you want to list. Showing a Specific User Processes Using ps: l is multi-threaded (using CLONE_THREAD, like NPTL pthreads do).L has pages locked into memory (for real-time and custom IO).Z defunct (“zombie”) process, terminated but not reaped by its parent.t stopped by debugger during the tracing.S interruptible sleep (waiting for an event to complete).Possible code stats explained in ps man page are: STAT: The column stats show code states for the process. Ps2pdf is a simple wrapper around ghostscript ( gs. Further details can be found on the ps2pdf manual page. This works fine when only one PostScript file has to be converted. TIME: CPU usage of process or thread, incremented each time the system clock ticks and the process or thread is found to be runningĬOMMAND: This is the same as the previously explained CMD column. The simplest way to convert PostScript files into PDF on our Linux machines is to use the ps2pdf command, e.g. START: This column shows when the process started. Shows the memory occupied by a process in the ram memory (not in swap). VSZ: Shows the virtual memory used by the process. If you want to check memory use by process, you can read How to Check Memory Usage Per Process on Linux. This column isn’t recommendable for users to check the memory use because the used memory amount isn’t exact. %MEM: This column shows the RSS (Resident set size) divided by the used memory. %CPU: This column displays the calculation of time used by the process divided by the time the process is in execution. USER: shows the effective user, whose permissions are used to run the process. The columns display the following information: