File utilities
ddsplit splits a large file into smaller pieces Jun 30 1999
debill removes annoying ^Ms from DOS text files Aug 24 1999
oldtouch synchronizes modification times of directories with files and of files with svn info May 20 2009
substitute search and replace in text files Jul 8 2008
txt2mac converts a Unix or MS text file to Mac text format Jun 5 2004
XML, LaTeX, Postscript
escapechars converts non-ASCII (UTF-8, Latin-1 etc.) files to ASCII with XML or TeX escape sequences Jun 18 2005
latex2utf8txt converts LaTeX files to UTF-8 text, removes line breaks from paragraphs Jun 18 2005
moveps translates coordinates in PostScript files Jan 03 2005
stripkml removes superfluous placemarks from GPS tracks in Google Earth KML files Mar 16 2008
tex2l converts LaTeX escape sequences to ISO8859-2 (Latin-2) encoded characters May 29 2006
texepslist lists EPS filenames occurring in LaTeX file(s) Jan 14 2009
DVD burning
dvdburn writes an ISO image to DVD using growisofs or cdrecord Jan 18 2006
id3ls lists MP3 ID3 info using id3, id3tool, id3ed or id3v2 May 6 2006
ram2wav converts RealAudio RAM to WAV Apr 21 2008
System, Network
lpready lists printers and enables disabled printers Feb 8 2008
pping ping script that warns about dropped packets Dec 4 2008
ppwgen generates a password May 14 2008
rpmlist prints sorted list of installed software packages Nov 9 2008
sdirdiff, sgetdiff, sputdiff remote directory synchronization with ssh Oct 5 2008
sreget, sreput downloads a file with ssh starting at end of local file Oct 19 2006
Error checking
pcsums.c calculates BSD, SysV, CRC-32 checksums, stops on I/O error Dec 23 2005
sysvsums calculates System V checksums for all files recursively Jun 5 2004
Startup files
~/.vimrc for VI iMproved (vim/gvim) Jul 22 2009
~/.zshrc my Z shell startup file Jun 16 2009
(Probably) obsolete
freget downloads a file with ftp starting at end of local file Nov 9 1999
pscat concatenates PostScript files Dec 13 2008
rpmupdate updates your Linux from a directory containing RPM files May 9 2002
unbot removes robots from access log files Oct 9 2007


ddsplit
Splits a large file into smaller pieces.
The chunk size is specified in kilobytes.
ddsplit mozilla-5.0-M6.i386.rpm 1423 mozilla
dd if=mozilla-5.0-M6.i386.rpm of=mozilla.000 bs=1024 count=1423 skip=0
1423+0 records in
1423+0 records out
dd if=mozilla-5.0-M6.i386.rpm of=mozilla.001 bs=1024 count=1423 skip=1423
1423+0 records in
1423+0 records out
dd if=mozilla-5.0-M6.i386.rpm of=mozilla.002 bs=1024 count=1423 skip=2846
1423+0 records in
1423+0 records out
dd if=mozilla-5.0-M6.i386.rpm of=mozilla.003 bs=1024 skip=4269
288+1 records in
288+1 records out
ls -l mozilla*
-rw-r--r-- 1 cspeter cspeter 4667180 Jun 19 17:50 mozilla-5.0-M6.i386.rpm
-rw-r--r-- 1 cspeter cspeter 1457152 Jun 30 12:13 mozilla.000
-rw-r--r-- 1 cspeter cspeter 1457152 Jun 30 12:13 mozilla.001
-rw-r--r-- 1 cspeter cspeter 1457152 Jun 30 12:13 mozilla.002
-rw-r--r-- 1 cspeter cspeter 295724 Jun 30 12:13 mozilla.003
The split files can be put together again simply by using cat:
cat mozilla.0* >mozilla.rpm
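The chunking shown in the transcript can be sketched in Python. This is a minimal illustration, not the ddsplit script itself: the function names split_file and join_files are hypothetical, and only the kilobyte chunk size and the prefix.000, prefix.001, ... naming are taken from the example above.

```python
from pathlib import Path


def split_file(source, chunk_kb, prefix):
    """Split `source` into pieces of `chunk_kb` kilobytes named prefix.000, prefix.001, ..."""
    chunk_size = chunk_kb * 1024  # the chunk size is given in kilobytes
    pieces = []
    with open(source, "rb") as src:
        index = 0
        while True:
            data = src.read(chunk_size)
            if not data:
                break  # end of file reached
            piece = Path(f"{prefix}.{index:03d}")
            piece.write_bytes(data)
            pieces.append(piece)
            index += 1
    return pieces


def join_files(pieces, target):
    """Concatenate the pieces in the given order, like cat does."""
    with open(target, "wb") as out:
        for piece in pieces:
            out.write(Path(piece).read_bytes())
```

As in the transcript, only the last piece may be shorter than the chunk size.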
debill
Tiny sh script that converts WinDOS text files to Unix.
debill "TEXT F~1.TXT" >text_file
oldtouch
Synchronizes modification times of directories with files and of files with svn info.

sh script.

oldtouch foo anotherdir
substitute
Search and replace in text files.

Perl script that searches the specified files for a pattern, and if found, replaces that pattern with the replacement text. File modification times are preserved if the -p option is specified as the first argument.

substitute '#e0e0e0' '#ccffcc' marvin/*.html
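The basic behaviour described above (replace a pattern in place, optionally keeping the file's modification time) might look like this in Python. It is a sketch under assumptions: the function name and the preserve_mtime parameter are hypothetical stand-ins for the Perl script and its -p option, and the pattern is treated as a Python regular expression.

```python
import os
import re


def substitute(pattern, replacement, path, preserve_mtime=False):
    """Replace every match of `pattern` in the file; return True if the file changed."""
    with open(path, encoding="utf-8", newline="") as f:
        original = f.read()
    updated = re.sub(pattern, replacement, original)
    if updated == original:
        return False  # pattern not found: leave the file untouched
    stat = os.stat(path)  # remember the timestamps before rewriting
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(updated)
    if preserve_mtime:
        # Restore the access and modification times recorded above
        os.utime(path, ns=(stat.st_atime_ns, stat.st_mtime_ns))
    return True
```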
Using the -m i or -m x option, a multiline block can also be replaced by the contents of a file. The first non-option argument is the pattern that must match the first line of the block. The second argument is the pattern matching the last line of the block (-m i) or the following line (-m x). The third argument is the file containing the replacement.
substitute -m x "^...This library is free software; you can redistribute it and/or" "^package " new_license.txt *.java
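The difference between -m i and -m x can be illustrated with a short Python sketch. The function name replace_block and the inclusive flag are hypothetical, and the sketch replaces only the first matching block; whether the real script replaces every block in a file is not stated here.

```python
import re


def replace_block(lines, first_pat, end_pat, replacement, inclusive=True):
    """Replace the first block that starts at a line matching `first_pat`.

    inclusive=True  mimics -m i: the line matching `end_pat` is the last line of the block.
    inclusive=False mimics -m x: the line matching `end_pat` follows the block and is kept.
    """
    start = next((i for i, line in enumerate(lines) if re.search(first_pat, line)), None)
    if start is None:
        return lines  # no block found
    end = next((j for j in range(start + 1, len(lines)) if re.search(end_pat, lines[j])), None)
    if end is None:
        return lines  # unterminated block: change nothing
    stop = end + 1 if inclusive else end
    return lines[:start] + list(replacement) + lines[stop:]
```

With inclusive=False and an end pattern of "^package ", the old license header of a Java file is replaced while the package line itself survives, which is what the example command above relies on.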
If the -N option (-1, -2 etc.) is used, then replacing is only performed in the first N lines of the file(s).

Perl's substitution operator can also be used directly. To replace text strings like "c=0.01 psi0=0.3" by "lambda=0.01 sigma=0.3", use

substitute 's/c=(.*)psi0=/lambda=$1sigma=/' *.in
txt2mac
Tiny Perl script that converts a Unix or MS text file to Mac text format.
txt2mac file.txt >mac_text_file
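The conversion amounts to rewriting line endings. A minimal Python sketch, assuming "Mac text format" means classic Mac OS carriage-return line endings (the function name to_mac is hypothetical):

```python
def to_mac(data: bytes) -> bytes:
    """Convert MS (CRLF) and Unix (LF) line endings to classic Mac (CR)."""
    # CRLF must be handled first, otherwise its LF would produce a second CR
    return data.replace(b"\r\n", b"\r").replace(b"\n", b"\r")
```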
escapechars
Perl script that converts non-ASCII (UTF-8, Latin-1 etc.) files to ASCII with escape sequences.
escapechars utf8 xml file.utf8.html > file.html        # use ampersand unicode escape sequences for XML

escapechars latin2 tex file.latin2.tex > file.tex      # use TeX escape sequences like {\'a}, {\H o} etc.
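Both output styles can be sketched in Python. The function names are hypothetical, and the TeX table below is a deliberately tiny, assumed subset for illustration; the real script presumably covers far more characters.

```python
def escape_xml(text: str) -> str:
    """Replace every non-ASCII character with a numeric XML character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Assumed sample of TeX accent escapes; not the script's actual table
TEX_ESCAPES = {
    "\u00e1": r"{\'a}",   # a with acute accent
    "\u00e9": r"{\'e}",   # e with acute accent
    "\u0151": r"{\H o}",  # o with double acute accent
    "\u0171": r"{\H u}",  # u with double acute accent
}


def escape_tex(text: str) -> str:
    """Replace the characters listed in TEX_ESCAPES; everything else passes through."""
    return "".join(TEX_ESCAPES.get(ch, ch) for ch in text)
```

Input bytes would first be decoded with the encoding named on the command line, for example data.decode("latin2") or data.decode("utf-8").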
latex2utf8txt
Perl script that converts LaTeX files to UTF-8 text. Also removes line breaks from paragraphs.
latex2utf8txt file.tex
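The paragraph unwrapping step can be sketched as follows. This covers only the removal of line breaks, not the LaTeX conversion, and it assumes that paragraphs are separated by blank lines; the function name is hypothetical.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join the lines of each paragraph into one line, keeping blank-line separators."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return "\n\n".join(
        " ".join(line.strip() for line in paragraph.splitlines())
        for paragraph in paragraphs
    )
```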
moveps
Perl script that translates the BoundingBox and the arguments of all moveto commands in a PostScript file.
moveps -508 0 rk4-r2.eps > changed.eps
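A rough Python sketch of the same idea: shift the four BoundingBox numbers and every "x y moveto" pair by (dx, dy). The regular expressions and the function name move_ps are assumptions for illustration; files that define their own abbreviation for moveto are not handled.

```python
import re

# Integer or decimal number with optional sign
_NUM = r"[-+]?(?:\d+\.?\d*|\.\d+)"


def _fmt(value: float) -> str:
    """Print whole numbers without a trailing .0."""
    return str(int(value)) if value == int(value) else repr(value)


def move_ps(text: str, dx: float, dy: float) -> str:
    """Translate the BoundingBox and all explicit moveto arguments by (dx, dy)."""

    def shift_bbox(match):
        llx, lly, urx, ury = (float(v) for v in match.group(2).split())
        shifted = (llx + dx, lly + dy, urx + dx, ury + dy)
        return match.group(1) + " ".join(_fmt(v) for v in shifted)

    def shift_moveto(match):
        x = float(match.group(1)) + dx
        y = float(match.group(2)) + dy
        return f"{_fmt(x)} {_fmt(y)} moveto"

    text = re.sub(rf"(%%BoundingBox:\s*)({_NUM}(?:\s+{_NUM}){{3}})", shift_bbox, text)
    return re.sub(rf"({_NUM})\s+({_NUM})\s+moveto\b", shift_moveto, text)
```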
stripkml
Simplifies Google Earth KML files by removing superfluous placemarks ("Points") from GPS tracks. Only the Path is left.
Perl script based on LibXML.
stripkml pakistan.kml > pakistan.stripped.kml
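The filtering can be sketched with Python's standard XML library instead of LibXML. This is an illustration under assumptions: the function name strip_points is hypothetical, and a placemark is treated as superfluous when it has a Point element as a direct child. Tags are compared by local name so that the sketch does not depend on a particular KML namespace URI.

```python
import xml.etree.ElementTree as ET


def _local(tag: str) -> str:
    """Strip the {namespace} prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def strip_points(kml_text: str) -> str:
    """Remove every Placemark that contains a Point; paths (LineString) are kept."""
    root = ET.fromstring(kml_text)
    for parent in list(root.iter()):
        for child in list(parent):
            is_placemark = _local(child.tag) == "Placemark"
            if is_placemark and any(_local(g.tag) == "Point" for g in child):
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")
```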
tex2l
sh script that uses sed to convert escape sequences in LaTeX files to ISO8859-2 (Latin-2) encoded characters. For the inverse conversion, use escapechars latin2 tex.
texepslist
Perl script that lists PostScript and EPS filenames occurring in LaTeX file(s).
dvdburn
sh script that uses growisofs or cdrecord to write an ISO image to DVD.
Note that you might have to change the dev and driveropts options in the script before using it first.
dvdburn stuff.iso
cdrecord -v -dao speed=2 dev=/dev/dvdrecorder driveropts=burnfree -isosize stuff.iso
Proceed? (yes/no) yes
id3ls
sh script that tries to use id3, id3tool, id3ed or id3v2 to list the MP3 ID3 info in the current directory or for the specified files.
The output format is the same regardless of which program is used.
ram2wav
sh script that converts RealAudio RAM to WAV using wget and MPlayer.


	ram2wav 7
lpready
sh script that lists available printers in /etc/printcap and enables disabled (probably USB) printers.
pping
Perl script that executes ping and prints a warning message immediately when a packet is dropped.
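One way to notice dropped packets is to watch for gaps in the icmp_seq numbers that ping prints. The Python sketch below takes that approach as an assumption; how pping itself detects drops is not described here, and the names find_drops and watch are hypothetical.

```python
import re
import subprocess

SEQ = re.compile(r"icmp_seq=(\d+)")


def find_drops(lines):
    """Yield the sequence numbers missing between consecutive ping replies."""
    expected = None
    for line in lines:
        match = SEQ.search(line)
        if not match:
            continue  # header or statistics line
        seq = int(match.group(1))
        if expected is not None and seq > expected:
            yield from range(expected, seq)
        expected = seq + 1


def watch(host):
    """Run ping and print a warning as soon as a gap in the replies shows up."""
    with subprocess.Popen(["ping", host], stdout=subprocess.PIPE, text=True) as proc:
        for seq in find_drops(proc.stdout):
            print(f"WARNING: packet {seq} dropped", flush=True)
```

With this approach a drop is only reported once the next reply arrives.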
ppwgen
Perl script that generates an 8-character password containing random alphanumeric characters. The generated password contains at least 2 digits, 2 lower case and 2 upper case letters.
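The constraints above can be met by drawing the required characters first and shuffling afterwards. This Python sketch is one possible construction, not the Perl script's algorithm; the function name generate_password is hypothetical.

```python
import secrets
import string


def generate_password(length=8):
    """Return an alphanumeric password with at least 2 digits, 2 lower and 2 upper case letters."""
    if length < 6:
        raise ValueError("length must be at least 6 to fit the required characters")
    rng = secrets.SystemRandom()
    # Guarantee the minimum of each class, then fill up with any alphanumeric character
    chars = (
        [rng.choice(string.digits) for _ in range(2)]
        + [rng.choice(string.ascii_lowercase) for _ in range(2)]
        + [rng.choice(string.ascii_uppercase) for _ in range(2)]
        + [rng.choice(string.ascii_letters + string.digits) for _ in range(length - 6)]
    )
    rng.shuffle(chars)  # so the guaranteed characters do not sit at fixed positions
    return "".join(chars)
```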
rpmlist
sh script that lists installed RPM packages in installation time, name or size order.
sdirdiff, sgetdiff, sputdiff
sh scripts for directory synchronization.

sdirdiff shows the difference between a local and a remote directory. Uses ssh to log in to the remote host, and sum to determine if a local and a remote file have the same contents.
Native executables, ar archives, object, dependency info (*.d), LaTeX auxiliary, and backup (*~ *.bak *.BAK) files are skipped. (Native binaries are recognized using the file command and looking for the "ELF " or "COFF " string in its output.)
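The comparison step can be sketched in Python, given one checksum table per side. This is an illustration under assumptions: the function name dir_diff is hypothetical, only the backup suffixes from the list above are skipped, and the < and > markers are read diff-style (< for the local side, > for the remote side), which is an interpretation of sdirdiff's output rather than something stated here.

```python
def dir_diff(local, remote, skip_suffixes=("~", ".bak", ".BAK")):
    """Compare two {filename: checksum} tables and return diff-style lines.

    A file present on one side only yields one line; a file whose checksums
    differ yields both a "<" and a ">" line.
    """
    lines = []
    for name in sorted(set(local) | set(remote)):
        if name.endswith(skip_suffixes):
            continue  # backup files are ignored
        if local.get(name) == remote.get(name):
            continue  # same checksum on both sides
        if name in local:
            lines.append(f"< {name}")
        if name in remote:
            lines.append(f"> {name}")
    return lines
```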

sdirdiff sunserv
< ddsplit
< index.html
> index.html
Furthermore, option -x lets you skip files with the specified extensions:
sdirdiff -x ".html .dat .whatever" sunserv
< ddsplit
sputdiff uploads every file in the current working directory that is missing on the remote host or whose remote version has different contents. Uses tar and gzip to package and extract the difference, and ssh to copy the difference to the remote host.
sputdiff sunserv
x ddsplit, 799 bytes, 2 tape blocks
x index.html, 4416 bytes, 9 tape blocks
sgetdiff does the opposite: downloads the difference.

All three programs are contained in one script file. To use them, download sdirdiff, make it executable, then make two hard links:

chmod +x sdirdiff
ln sdirdiff sgetdiff
ln sdirdiff sputdiff
Alternatively, you can download sdirdiff.tar.gz which contains the three hard linked executables.
sreget
Resumes the download of a partially downloaded file.

Copies the end of the remote file to the local host using ssh, then appends it to the local file so that the two files will have the same contents. To speed up downloading, it compresses the data with gzip while copying.

I use it to download only the changed parts (the end) of a large logfile through a slow network connection (modem):

sreget sunserv:log/access_log
0+1 records in
51+1 records out
Also works if the local file is compressed with gzip or bzip2:
sreget sunserv:log/access_log cspeter.log.bz2
bunzipping: cspeter.log.bz2 -> cspeter.log
0+1 records in
0+1 records out
bzipping: cspeter.log -> cspeter.log.bz2
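The core of the resume logic is an offset copy: skip as many bytes of the source as the destination already has, then append the rest. The Python sketch below shows only that step with two local paths standing in for the remote and local files; the ssh transport, the gzip compression and the handling of compressed local files are left out, and the function name is hypothetical.

```python
import os


def append_missing_tail(source_path, dest_path):
    """Append to dest_path the bytes of source_path beyond dest_path's current size.

    Returns the number of bytes appended.
    """
    offset = os.path.getsize(dest_path) if os.path.exists(dest_path) else 0
    with open(source_path, "rb") as src, open(dest_path, "ab") as dst:
        src.seek(offset)  # skip the part that was already transferred
        tail = src.read()
        dst.write(tail)
    return len(tail)
```

This only gives identical files if the destination really is a prefix of the source, which holds for append-only files such as logs.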
sreput
Inverse of sreget.

Copies the end of the local file to the remote host using ssh, then appends it to the remote file so that the two files will have the same contents. To speed up uploading, it compresses the data with gzip while copying.

sreput 264-10.bdata dali:data
0+1 records in
3456+1 records out
1769490 bytes (1.8 MB) copied, 0.156672 seconds, 11.3 MB/s
pcsums
Checksum calculator written in C.

Differences between GNU sum 2.0 and pcsums:

Useful to determine the location of errors on CDROMs or floppy disks.
pcsums /dev/fd0 /dev/cdrom /boot/vmlinux-* /var/lib/rpm/packages.rpm
       M           k   BSD  SysV   CRC-32
     1.4      1395.0 46789 52579 3F99108D /dev/fd0
/dev/fd0: Input/output error
   549.5    562674.0 20762  7470 4DF3BA6F /dev/cdrom
/dev/cdrom: Input/output error
     1.5      1508.2 42095 41065 B5C1A3C2 vmlinux-2.2.12-20
    18.2     18628.6 33476 50796 9460293E packages.rpm
Compilation: cc -O pcsums.c -o pcsums
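For readers who want to see the checksums side by side, here is a Python sketch that computes all three in one pass and stops at the first read error, keeping the values for the bytes read so far. It is not a port of pcsums.c: the function name is hypothetical, the BSD and System V sums follow the classic sum algorithms, and the CRC-32 is the zlib variant, which is an assumption since the variant pcsums uses is not stated here.

```python
import zlib


def checksums(stream, chunk_size=65536):
    """Return (bytes_read, bsd, sysv, crc32, error) for a binary stream.

    Reading stops at the first I/O error; the values then cover the bytes read so far.
    """
    size = bsd = total = crc = 0
    error = None
    while True:
        try:
            chunk = stream.read(chunk_size)
        except OSError as exc:
            error = exc  # remember the error and report what we have
            break
        if not chunk:
            break
        size += len(chunk)
        for byte in chunk:
            bsd = (bsd >> 1) | ((bsd & 1) << 15)  # rotate right within 16 bits
            bsd = (bsd + byte) & 0xFFFF
        total = (total + sum(chunk)) & 0xFFFFFFFF  # 32-bit running byte sum
        crc = zlib.crc32(chunk, crc)
    folded = (total & 0xFFFF) + (total >> 16)
    sysv = (folded & 0xFFFF) + (folded >> 16)
    return size, bsd, sysv, crc, error
```

When an error is returned, bytes_read gives the approximate position of the unreadable area, to within one chunk.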
sysvsums
Perl script that prints the file sizes and the System V checksums for all files in the current directory recursively.
      806 59248 bashrc
      792 58067 ddsplit
       34 02175 debill
      918 04373 dvdburn
      709 57709 eftp
      844 55721 fix-accesslog
     2700 09993 freget
    14857 28863 index.html
      398 33218 inputrc
      468 40072 l2tex
      131 11888 Makefile
     4819 15344 pcsums.c
     1421 38209 pscat
     5108 60253 Rename.class
     7378 58557
     1176 16144 rpmupdate
     2682 00800 sdirdiff
     1283 37048 sdirdiff.tar.gz
     2682 00800 sgetdiff
     2682 00800 sputdiff
     2030 24185 sreget
      963 08028 substitute
      842 58194 sysvsums
      646 53427 tex2l
     4852 00893 unbot
     4654 62536 vimrc
    18983 57653 zshrc
~/.vimrc and gvimrc
Startup files for VI iMproved.

Automatic indentation and/or syntax highlighting for C, C++, Perl, Java, JavaScript, JSP, shell scripts, HTML, LaTeX, etc. Editing of gzip- and bzip2-compressed files.
Extra keybindings: Ctrl-J (paragraph formatting like in pico), F6 (next file), Alt-F6 (previous file), F9 (make), Alt-F9 (build all), F10 (replace spaces by tabs). In LaTeX mode: F9 (latex && dvips), Alt-D (xdvi), Alt-V (gv or ghostview).

~/.zshrc
Startup file for the Z Shell.

The xterm title contains the last command (in zsh >= 3.1).
F1: man for the current command, F9: smart make (also works if you are deeper than the makefile's directory), l: ls -ltr, p: popd, zless: less -i for compressed files (.Z, .gz, .bz2). If vim exists, vi is its alias and v is gvim -R.
Java-related functions: jdk11 and jdk12 to switch from one JDK version to another by changing PATH and CLASSPATH (note that JDK12HOME, JDK11HOME and SWING_JAR must be set correctly for these functions to work).
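The "smart make" idea (run make from the nearest ancestor directory that has a makefile) boils down to an upward directory search. The Python sketch below illustrates that search only; it is not the zsh function from the startup file, and the function name and the list of makefile names are assumptions.

```python
from pathlib import Path


def find_makefile_dir(start="."):
    """Return the closest directory at or above `start` that contains a makefile, or None."""
    current = Path(start).resolve()
    for directory in (current, *current.parents):
        # Check the usual makefile names in this directory
        if any((directory / name).is_file() for name in ("GNUmakefile", "makefile", "Makefile")):
            return directory
    return None
```

A wrapper could then run make with that directory as its working directory.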

In the Bourne-Again Shell (versions 1.x-2.0x), most of these features cannot be implemented. The most I could do is in bashrc (~/.bashrc) and inputrc (~/.inputrc).

freget
Resumes the download of a partially downloaded file.

Simple ftp reget, written in Perl. Requires the Net::FTP module.
Its extra feature is that it also works if the local file is compressed (gzip or bzip2) but the remote one is not.
I use it instead of sreget on a host that has no ssh daemon.

freget chemaxon.orig.bz2
bunzipping: chemaxon.orig.bz2 -> chemaxon.orig
opening ftp connection to
cd log
get access_log chemaxon.orig 11444158
bzipping: chemaxon.orig -> chemaxon.orig.bz2
pscat
Concatenates PostScript files.

Perl script that merges two or more PostScript files and prints the result to stdout.

pscat > pi+p,
Warning! This script is a simple hack, use at your own risk. In some (maybe most?) cases, gs is better:
gs -q -dNOPAUSE -dBATCH -sOutputFile=pi+p, -sDEVICE=pswrite
rpmupdate
Perl script that loops through all *.rpm files in the current directory and installs those packages for which an older version is already installed on your Linux system. (This script is obsolete!)
unbot
Removes lines associated with robots and spiders from web server access log files.

Written in Perl.
The logfile can be compressed (gzip or bzip2). Example:

unbot cspeter.log.bz2 -b robot.bz2 | bzip2 >human.bz2