Vim, LaTeX, awk and Gnuplot
This document is just a draft, and I don't know how likely it is that I'll ever
finish it. I find it hugely unlikely that anyone will ever find any of it
useful, but if you do, or if you see any errors, feel free to tell me so at
drichfld@freeshell.org.
The Vim-LaTeX suite
I used Vim to edit my thesis, which was in LaTeX. The Vim-LaTeX suite makes
that a lot easier. After installing the suite, go check out the documentation:
typing :h suite will give you an amazing amount of information.
Not only does it have syntax folding and highlighting, it knows how to direct
the LaTeX compiler, even doing multiple passes when BibTeX is involved. It can
also be made to understand what to do when a Makefile is present. A good tip if
you don't want to use a Makefile, but have your chapters in separate .tex
files, is to create a file called MainName.latexmain in your project directory.
Then even if you're editing ChapWetChem.tex, the LaTeX-suite knows to compile
MainName.tex.
Oh, and never call your chapters "chapter1.tex" etc. Who knows when you'll
want to add one in? This also makes it easier to insert LaTeX references.
It's much easier to remember to type
...as explained in chapter~\ref{ChapMaths}, $2+2=5$ for large values of 2.
than remembering that what used to be Chapter2.tex is now Chapter3.tex, and
it's much easier to read, too.
A really nifty trick is to enable the "-src-specials" option in your texrc file
(I do this in my user directory: /home/david/.vim/ftplugin/tex/texrc). This
allows you to go to a place in your file and type \ls, and you are taken to the
dvi, at the place where you were in your source. This is already much cooler
than the cat's whiskers, but it gets better. Also enable the
g:Tex_UseEditorSettingInDVIViewer option, and you can Ctrl-click in your dvi
window, and you will be taken to the place in your code that corresponds to
that output element (be it text, graphics or even a section, if you click on
the header line)!
The Vim-LaTeX suite also understands BibTeX quite nicely, so if you have a .bib
file set up for your project that is referenced in your file, the master file
for the project, or any file you're including from your current file, you're
set to do F9-completion: Type \cite{bla[F9] and you will get a window with a
list of all your citations whose keys start with "bla".
I made all my figures both as .ps and as .pdf files. The reason for this
apparent silliness is that a dvi compile is quicker than making a pdf, but
good old pdflatex doesn't insert postscript pictures. It can put in bitmaps of
all weird and wonderful types, which normal LaTeX can't do, but ps is a
mystery to it.
I started out with \ifpdf commands in each figure, but soon realized that
there is an easier way: If each figure is in both forms, one with a .pdf
extension, and one with .ps, all you have to do is:
\includegraphics[width=\textwidth]{graphs/GrowthGraph}
and LaTeX will use the ps, and pdflatex will use the pdf.
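Since this trick only works when every figure really does exist in both forms, a quick check before compiling can save a confusing error. The following is a minimal sketch; the function name figures_missing_pdf is made up for illustration, and it assumes all figures live in one directory (such as graphs/ above).

```shell
# Hypothetical helper: list every .ps figure in a directory that has no
# .pdf twin, so pdflatex would fail to find it.
figures_missing_pdf() {
    dir=$1
    for ps in "$dir"/*.ps; do
        [ -e "$ps" ] || continue                  # directory holds no .ps files
        [ -e "${ps%.ps}.pdf" ] || printf '%s\n' "$ps"
    done
}

# Usage: figures_missing_pdf graphs
```

An empty result means pdflatex and plain LaTeX should both find their version of each figure.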
Gnuplot
I used gnuplot's curve fitting quite a lot. One thing that you won't find
there is the infamous r-squared, or coefficient of determination. This isn't
much of a problem. If you really need an R² value, and don't want to use the
more technical goodness of fit tools that gnuplot gives you, there are plenty
of tools that will do that for you.
If you've ever seen documents with pasted-in bitmaps of graphs, you'll
appreciate the quality that vector graphics can give to your graphs. The
postscript terminal on gnuplot is quite nice for this. However, I've found
that LaTeX has poor support for rotated ps files, so I ended up doing all my
graphs in "portrait" orientation, and cropping them by hand.
My .gpi files had the following commands:
set term postscript enhanced portrait 10
set output 'Parameters.ps'
set size ratio 0.7
Then I would end up with a .ps file with a graph at the bottom of the page.
This could be fixed by changing the line
%%BoundingBox: 50 50 554 770
to fit the graph, in this case to
%%BoundingBox: 50 50 554 400
The new coordinates are easy to find with gv, which continuously reports the
coordinates of the mouse. This trick is also nice if Gnuplot crops a title or
label (this sometimes happens when you're using special characters).
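If you have many graphs, the same edit can be scripted instead of done by hand. This is only a sketch: set_bbox is a hypothetical helper name, and it assumes the file carries its %%BoundingBox comment with literal numbers in the header, as in the gnuplot output above.

```shell
# Hypothetical helper: rewrite the %%BoundingBox line of a PostScript file.
# Arguments: file, then lower-left x, lower-left y, upper-right x, upper-right y.
set_bbox() {
    file=$1; shift
    tmp=$(mktemp) || return 1
    # Write the edited copy to a temp file, then copy it back over the
    # original so the file keeps its permissions.
    sed "s/^%%BoundingBox:.*/%%BoundingBox: $1 $2 $3 $4/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage: set_bbox Parameters.ps 50 50 554 400
```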
In the case of the pdf file, you'll find the coordinates after the word
"MediaBox". However, after fixing the file, if you changed the number of
characters in the MediaBox statement, you'll have to run the file through
pdftk:
pdftk broken.pdf output fixed.pdf
or all the programs that have to work with the pdf after your hack will
moan and complain.
Chromquest and awk
Thermo Scientific make a range of really good value-for-money HPLC
instruments, and in general,
the ThermoQuest software is quite user-friendly. I used version 2.51, which is
apparently horrendously out of date, and I found it to be reasonably
feature-rich, if slightly buggy. The one thing that I could not make the
software do was to export 3D data from the diode-array-detector. But no fear,
brute force will always get you there. I grabbed the .dat file that the
chromatography data was dumped into, and did the following:
hexdump -C chrom.dat > chrom.hexdump
vim chrom.hexdump
Now I went about halfway down the file, because I figured the 3D data would
take up most of the file's bulk. I then searched backwards for the first
occurrence of a duplicate line, which hexdump gives as an asterisk:
?^*$
and what I found was:
00100640 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
001007a0 00 00 00 00 c4 fe ff ff 66 00 00 00 31 01 00 00 |....Äþÿÿf...1...|
001007b0 df fe ff ff a4 00 00 00 be 01 00 00 ea ff ff ff |ßþÿÿ¤...¾...êÿÿÿ|
001007c0 8e ff ff ff 07 00 00 00 b5 fe ff ff 76 00 00 00 |.ÿÿÿ....µþÿÿv...|
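The asterisk lines can also be listed from the shell, which has the side benefit of reporting their line numbers in the hexdump. A small sketch (star_lines is a made-up name, not part of any tool):

```shell
# Hypothetical helper: print the line number of every line in a hexdump
# that consists of a single asterisk (hexdump's marker for repeated lines).
star_lines() {
    grep -n '^\*$' "$1" | cut -d: -f1
}

# Usage: star_lines chrom.hexdump
```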
Of course, you might be unlucky, and have a duplicate line halfway through your
data. To check for this, you can look upwards through your data a bit. In my
files (YMMV), I found a long section of data before my 3D block that just had
increasing numbers, like this:
001005a0 e9 0e 00 00 ea 0e 00 00 eb 0e 00 00 ec 0e 00 00 |é...ê...ë...ì...|
001005b0 ed 0e 00 00 ee 0e 00 00 ef 0e 00 00 f0 0e 00 00 |í...î...ï...ð...|
001005c0 f1 0e 00 00 f2 0e 00 00 f3 0e 00 00 f4 0e 00 00 |ñ...ò...ó...ô...|
001005d0 f5 0e 00 00 f6 0e 00 00 f7 0e 00 00 f8 0e 00 00 |õ...ö...÷...ø...|
After some fiddling around, I discovered that the data was arranged as
four-byte numbers, with the most significant bytes last (little-endian). For
example, the number "c4 fe ff ff" above is actually 0xFFFFFEC4, or -316 as a
signed 32-bit integer.
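To check a single value by hand, the byte reversal and sign handling can be sketched in plain shell arithmetic. The function name le32_to_int is invented for this example, and it assumes a shell with at least 64-bit arithmetic (bash and most modern sh implementations).

```shell
# Hypothetical helper: decode four hex bytes, given in file order (least
# significant byte first), as a signed 32-bit integer.
le32_to_int() {
    value=$(( 0x$4$3$2$1 ))               # reverse the byte order
    if [ "$value" -ge 2147483648 ]; then  # sign bit set: negative number
        value=$(( value - 4294967296 ))   # subtract 2^32
    fi
    printf '%d\n' "$value"
}

# Usage:
#   le32_to_int c4 fe ff ff
#   le32_to_int 66 00 00 00    # prints 102
```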
Likewise, at the end of your data, you will see some text, for example:
001d5e00 32 32 30 20 6e 6d 00 00 80 3f 03 6d 41 55 6f 12 |220 nm...?.mAUo.|
001d5e10 83 3a 00 00 00 80 3f 00 00 16 45 00 00 00 00 01 |.:....?...E.....|
001d5e20 00 00 00 17 00 00 00 00 00 80 3f 01 00 00 00 00 |..........?.....|
001d5e30 00 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 |................|
001d5e40 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 |................|
001d5e50 01 00 00 00 01 00 ff ff 00 00 0d 00 43 44 65 74 |......ÿÿ....CDet|
001d5e60 54 72 61 63 65 49 6e 66 6f 01 00 00 00 05 00 00 |TraceInfo.......|
So now there is the task of changing the data into a 3D chromatogram. Note the
line numbers of the start and end of your 3D data in the hexdump; in my case,
these were 9495 and 64119. Now we fire up awk:
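# Stage 1: each hexdump line holds four 32-bit words, least significant byte
# first, so print the bytes of each word in reverse order, one word per line.
awk '9494 < FNR && FNR < 64120 {
    print "0x" $5  $4  $3  $2
    print "0x" $9  $8  $7  $6
    print "0x" $13 $12 $11 $10
    print "0x" $17 $16 $15 $14
}' < chrom.hexdump |
awk --non-decimal-data '
    # Stage 2 (gawk only): a leading hex digit of 8-f means the sign bit is
    # set, so subtract 2^32 to get the signed value.
    /^0x[89a-f]/ { print NR " " $0 - 4294967296; next }
    { print NR " " $0 + 0 }' > chrom.3d.num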
This gives you the 3D data in one line-numbered column, just the way gnuplot
likes it for one-dimensional display. Now you can open gnuplot and type
plot 'chrom.3d.num' with lines
You will see your chromatogram with a strange, filled-in look. Now zoom in on
a nice large peak, and you will see your chromatogram as a series of UV
spectra! Now you have to figure out how many points there are per spectrum.
Zoom in nice and big, and double-click on the first point of one spectrum.
This has just copied the coordinates of that point into the clipboard. Now go
paste into vim or wherever. Next, double-click on the last point of that
spectrum and paste that.
If you zoomed in far enough, you should get the line numbers quite precisely,
for example 116573.1 and 116663.0. Now you have a data point (in the sense of a
3D chromatogram) starting at 116573 and ending at 116663 (for example).
Subtract those two numbers and add one, because both end points belong to the
spectrum, to get the width of one spectrum in data points, in this case 91.
Now we can fire up awk again:
awk -v n=91 '{
    printf "%d ", $2                 # column 2 holds the value
    if (NR % n == 0) print ""        # start a new row after n values
}
END { if (NR % n) print "" }' < chrom.3d.num > chrom.3d.matrix
And now the gnuplot command
splot 'chrom.3d.matrix' matrix w lines palette
gives a surface plot of the 3d chromatogram. If your chromatogram looks like
the low wavelengths have been lopped off and stuck at the end of the high
wavelengths, you have some junk at the start of chrom.3d.num to remove.
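Removing that junk amounts to dropping the first few lines of chrom.3d.num and renumbering the rest. A minimal sketch, assuming you have already worked out how many lines to drop (the name drop_leading and the output file name are placeholders):

```shell
# Hypothetical helper: drop the first N lines of a two-column file
# (line number, value) and renumber the remaining lines from 1.
drop_leading() {
    awk -v skip="$1" 'NR > skip { print NR - skip, $2 }' "$2"
}

# Usage: drop_leading 17 chrom.3d.num > chrom.3d.trimmed
```

After trimming, rebuild the matrix from the trimmed file and plot it again to see whether the spectra now line up.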