Here is a list of problems I found in LATEX2HTML 99.2 and corresponding FU fixes.
Reason: DOS has only very marginal redirection capabilities. pstoimg.bat does three redirections to reduce screen output. Each of them by itself is enough to crash the DOS box.
Solution: I removed all three. My l2hcrop does less output anyway.
Reason: Perl does not pass the parameters to batch files. LATEX cannot process images.tex if it does not know the file name.
Solution: Create a GNU Fortran 77 program, rb.exe (for "run bat"), and run all batch files through that executable.
Reason: DOS.
Solution: Make a batch file, latexl2h.bat, that returns a pointer, C:\Temp\rb.err, if an error occurs, and then make rb.exe return an error status to Perl.
Reason: DOS boxes do not have a buffer, and output redirection of stderr would be very difficult even without all the other redirection already there.
Solution: Create a log file C:\Temp\l2h.log and copy all the output, at least that of latex2html.bat, pstoimg.bat, and texexpand.bat, to that file. Allow an environment variable L2H_SLEEP to be set with a delay. (Note that reading from the keyboard tends to fail with all the input redirection being done.)
Reason: It tries to accommodate every possible version and variation of about 30 different external utilities not under the authors' control.
Solution: Specify exact version numbers for the netpbm utilities, which are in a constant state of flux. Ignore alternatives to freely available programs, and specify the latest version of the more consistent ghostscript, dvips, and perl utilities.
Reason: latex2html was written a long time ago.
Solution: Switch to the earliest I could find, Netpbm 10.6.
Reason: It is an old version.
Solution: Get those from the last version, 10.18.
Reason: Netpbm was not really designed for Windows.
Solution: Substitute a batch file as a crude fix.
Reason: Seems to be an indication that they are obsolete.
Solution: Rename them.
Reason: A known bug in PPMQUANT.
Solution: Substitute a batch file that runs pnmmap and pnmremap instead.
Reason: DOS cannot keep up with the amount of files being renamed.
Solution: On failure, put in a one second delay, and then try again.
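The retry idea can be sketched as follows. This is a Python illustration only, not the code used in LATEX2HTML-FU; the function name `rename_with_retry` and the `retries` default are assumptions, and only the one second delay comes from the text above.

```python
import os
import time


def rename_with_retry(src, dst, retries=3, delay_seconds=1.0):
    """Rename src to dst; on failure, wait and try again.

    Returns the number of retries that were needed. The retry count is
    a hypothetical default; the text only specifies the one second delay.
    """
    for attempt in range(retries + 1):
        try:
            os.replace(src, dst)
            return attempt
        except OSError:
            if attempt == retries:
                raise  # give up and let the caller see the error
            time.sleep(delay_seconds)
```

A caller that renames many files in a row would simply call `rename_with_retry(old, new)` in place of a bare rename.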
Reason: Beats me.
Solution: Provide a fake one, chckhtml.bat, doing nothing. You can always put one in there, following the lines of latexl2h.bat.
Reason: Beats me.
Solution: Change prefs.pm to get rid of png completely.
Reason: Whatever you set address to, it is redefined in latex2html.bat.
Solution: Change latex2html to not set address if it is already defined.
Reason: The authors did not put it in.
Solution: Put suitable switches and parameters in the pin files.
Reason: Page color is set to grey scale in l2hconf.
Solution: Set page color to the more general white.
Reason: A new line after a comment symbol.
Solution: Remove the new line.
Reason: An insane idea to have the icons placed on some central location on the web server, rather than safely with each document. You have to be a system administrator yourself to trust they will keep these icons safe for eternity. If these system administrators are still working here next year, and still remember what the icons were for in the first place, and are willing to figure out whether someone is still using them instead of simply deleting and waiting for complaints, that is.
Solution: Set the local icons option as default.
Reason: For historical reasons, it says.
Solution: Use HTML 4.
Reason: Beats me.
Solution: Set default options to enable bottom panels on all pages.
Reason: Putting section names between the icons causes them to flip back and forth from one page to the next. With all the text, there is not enough space for full titles.
Solution: Get rid of the text except that of the upcoming section, and put that behind the icons. Put them in an intuitive order. Move the navigation panel formatting to .latex2html-init to make it easier to customize.
Reason: Beats me.
Solution: Remade the icons. Still not very impressive, but better. Added some support for people to make their own.
Reason: Authors do not use MS Paint?
Solution: Add bmptopnm.exe from netpbm and create a batch file bmptoeps.bat.
Reason: Not designed that way.
Solution: Provide a batch file makel2h.bat to do the usage overhead and a utility l2hnewer.exe to check file dates.
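What a file date check of this kind might look like is sketched below in Python. It is not the source of l2hnewer.exe, whose exact behavior is not described here; the function name `is_newer` and the rule that a missing target counts as out of date are assumptions.

```python
import os


def is_newer(source, target):
    """Return True if target is missing or older than source.

    Assumption: a missing target means the work still has to be done.
    """
    if not os.path.exists(target):
        return True
    return os.path.getmtime(source) > os.path.getmtime(target)
```

A driver script could use such a test to skip the conversion when the html output is already more recent than the tex source.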
Reasons: Use of special style files, some with file names over 8 characters long. Use of a makefile, which simply does not work, for one thing because DOS batch files do not return error codes.
Addition: the two poorly documented makefiles do not work under Unix, either. Half the manual is not processed.
The official online manual is not that great either. For example, what to make of figure 3?
Solution: Provide the processed manual in pdf form, and the hthtml page in html form and just zip the rest.
Reason: No account is taken of older TeX versions.
Solution: Make a note to the users to rename it.
Reason: Beats me.
Solution: Make sure none is missing.
Reason: The font sizes are not fixed, depending on options and user browser setting.
Solution: If you represent math by images, you no longer have a choice about font size. So, set the chosen font size for the nonmath to the same fixed size using style sheets. (Note however that web page readers may set their browser to ignore style sheet font sizes. That is up to them. Their math will still be the same size.)
Reasons: Mozilla and Internet Explorer do not vertically align images at the same height. Also, for both browsers, image alignment depends on font size. Raised images are not correctly aligned by the latex2html pnmcrop procedure. If there is more than one image in the same line, Internet Explorer will mess up the alignment of one because of the presence of the other.
Solutions: Provide separate web pages for Mozilla and Internet Explorer. Provide an older version of pnmcrop instead of the latest, which does not work correctly. Provide a program, l2hcrop.exe, that crops more reliably. Fix the font size, and make l2hcrop compute the correct alignment for that font size. L2hcrop will correctly crop raised images. The IE problem with multiple images in the same line cannot be resolved, but l2hcrop minimizes it by cropping the images to the minimum size required.
Reasons: Sometimes latex2html produces alignment bars that are too short. In that case, pnmcrop will leave this artifact in the image. In addition, pnmcrop may be a wrong, too recent, version.
Solution: Use l2hcrop, and make it check and correct for the possibility that the alignment bars are too short. Ensure a compatible version of pnmcrop for the rare case that l2hcrop runs out of memory.
Reason: The whitespace put in to vertically align images.
Solution: Let l2hcrop cut the images to the smallest size possible to minimize the problem.
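Cropping to the smallest possible size amounts to finding the bounding box of the non-background pixels. The Python sketch below shows only that step on a plain 2D list; it is not l2hcrop's algorithm, which also has to deal with alignment bars and browser-specific alignment, and the name `crop_to_content` is made up for the example.

```python
def crop_to_content(pixels, background=0):
    """Crop a 2D list of pixel values to the smallest box that still
    holds every non-background pixel. Returns [] for an empty image."""
    rows = [r for r, row in enumerate(pixels)
            if any(p != background for p in row)]
    if not rows:
        return []
    cols = [c for c in range(len(pixels[0]))
            if any(row[c] != background for row in pixels)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in pixels[top:bottom + 1]]
```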
Reason: Ghostscript does not anti-alias embedded bitmaps.
Solution: Appropriate the extrascale option, ignored by latex2html for figures, to let ghostscript oversize the figures, then provide l2hcomp.exe to put them back to size with proper anti-aliasing.
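Rendering oversized and then shrinking is a standard way to get anti-aliasing: each output pixel becomes the average of a block of input pixels. The Python sketch below uses a simple box filter on a grayscale 2D list. It illustrates the principle only; whether l2hcomp.exe uses this particular filter is not stated, and `downsample` is a hypothetical name.

```python
def downsample(pixels, factor):
    """Shrink a grayscale image (2D list, values 0-255) by an integer
    factor, averaging each factor-by-factor block (a box filter)."""
    height = len(pixels) // factor
    width = len(pixels[0]) // factor
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            block = [pixels[y * factor + dy][x * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out
```

A hard black and white edge in the oversized image comes out as intermediate grays in the shrunken one, which is what makes embedded bitmaps look smooth.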
Reason: Apparently the authors want to simulate mathsurround, even if it just looks funny on a screen with much less real estate than a printed page.
Solution: Class the whitespace, and use style sheets to get rid of it.
Reason: Icons are hard coded, even with their links and file names.
Solution: None.
Reason: Seems to be intentional, probably a bug fix for something else.
Solution: Tell users to use {\lq} and {\rq}.
Reason: Well, what do you expect.
Solution: L2hcrop provides a partial solution, because one invocation of l2hcrop replaces about four invocations of pnmcrop, one of pnmquant, and potentially one of pnmflip as well. Also, l2hcrop reads the image file only once, keeping it in memory. The Quantum Mechanics for Engineers web pages, with well over 500 images, take 15 minutes to create, per browser, on my four or five year old Windows 98 laptop.
I do find that it runs many times more slowly (though correctly) on my desktop at work, also running Windows 98, but with 128 Mb memory instead of 256, and slower disks. Neither disk nor memory is the problem, however. The problem is that the processor is maxed out at 100% while running. Yet it is virtually the same processor as my laptop's. If anyone can solve this complete mystery for me..?
Reason: The algorithm ignores changes in spaces, as well as missing backslashes and brackets that you correct.
Solution: Warn the user to delete the image. Also note to try browser refresh.
Reason: They are named by process ID. DOS reuses process IDs.
Solution: Use time instead of process id for the name.
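A minimal Python sketch of naming by clock rather than by process ID follows. The name pattern (a `t` plus seven digits, chosen to fit an 8 character DOS base name) is an assumption for illustration, not the pattern LATEX2HTML-FU actually uses. Note that two names requested within the same second would still collide.

```python
import time


def temp_name(now=None):
    """Build a temporary base name from the clock instead of the PID.

    Hypothetical pattern: 't' followed by the last seven digits of the
    time in seconds, so the name stays within 8 characters.
    """
    if now is None:
        now = time.time()
    return "t%07d" % (int(now) % 10**7)
```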
Reason: I don't know.
Solution: None.
Reason: They are all converted into classless H1 tags, because really important divisions such as subparagraph need their own tag, and html has only 6.
Solution: Let subparagraph suffer.
Reason: Beats me.
Solution: Let makel2h move in a style file before latex2html gets the chance.
Reason: ppmtogif crashes trying to interlace gifs that are only 3 pixels or so high.
Solution: Let l2hcrop return a warning if the image is less than 16 pixels high, and turn off interlace if it is.
Reason: It is pushed in a subdirectory. I don't know why this file could not be processed in the original directory and then moved down, but anyway.
Solution: Warn the users more clearly about .. in texinputs, providing a test for it, and about not having relative path names. Pause the screen if latex returns a warning on images.tex and copy the files to the Temp directory for examination.
Reason: Beats me.
Solution: Warn the user.
Reason: The numbers are retrieved from latex for the given title. Unfortunately, Latex and html do not always agree on what is the title.
Solution: (1) Added a last resort search that first removes embedded mathematics from the titles, that makes the tex and html agree on the & (very common in legends because of references), and that prints to the log file to show why the search fails, if it still does. (2) If this last resort fails too, increment the previous number by one, which works most of the time. Warn the user anyway. Also add a warning to problem shooting not to use discretionary hyphens, since they cannot be fixed without also affecting true hyphens.
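The two-stage fallback can be sketched as below. This is a Python illustration under stated assumptions, not the Perl code in latex2html: the names `normalize_title` and `lookup_number`, the dictionary of LaTeX titles, and the exact normalization rules (dropping `$...$` math, mapping `\&` and `&amp;` to `&`) are all made up for the example.

```python
import re


def normalize_title(title):
    """Drop $...$ math, unify the ampersand, and collapse whitespace."""
    title = re.sub(r"\$[^$]*\$", "", title)  # remove embedded math
    title = title.replace("&amp;", "&").replace(r"\&", "&")
    return " ".join(title.split())


def lookup_number(html_title, latex_numbers, previous):
    """Find the number LaTeX assigned to a title.

    latex_numbers maps LaTeX titles to integer numbers (hypothetical
    data structure). Falls back to previous + 1 with a warning.
    """
    if html_title in latex_numbers:
        return latex_numbers[html_title]
    wanted = normalize_title(html_title)
    for tex_title, number in latex_numbers.items():
        if normalize_title(tex_title) == wanted:
            return number
    print("warning: no match for %r, guessing %d" % (html_title, previous + 1))
    return previous + 1
```

For example, with `{r"Energy \& mass $E=mc^2$": 3}` as the LaTeX side, the html title `Energy &amp; mass` is matched by the last resort search, while an unknown title simply gets the previous number plus one.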
LATEX2HTML treats \input{abstract} as \begin{abstract}, making a mess.
Reason: Unknown. It does input the file.
Solution: Warn the user to not make input file names look like environments.
Reason: Don't know.
Fix: Set default to have starred section titles go to toc.
Reason: Unknown.
Solution: None. (This does not seem to be much of a problem since it gets its numbers from latex anyway.)
Reason: The crop option is not passed to pstoimg.
Solution: Add it.
Reason: The figure and caption are put in a html table. The browsers then make the table as wide as the figure, which can be very small.
Solution: Specify the table to be 65% of the page width.
Reason: Perl asks for messy code, and not providing full comments in the code does not help. The code is also full of adhoc fixes, not just mine.
Solution: None.
Reason: That is how it is.
Solution: Warn the user to split it into two figure environments.
Reason: Yes.
Solution: Warn the user to put it in a caption, or live with it.
Reason: It is a unix script running Perl running pamflip.
Solution: Provide the earlier netpbm utility for Windows, include pamflip in Unix.
Reason: Seems to be a subroutine that removes duplicate entries.
Solution: Removed.
Reason: Unknown.
Solution: Commented out the lines, which should not be needed anyway.
Reason: A sharp is missing in the end of equation marker in versions/math.pl, so these markers stay in. How was the manual on the web made??
Solution: Added the sharp.
Reason: Unknown.
Solution: The LATEX2HTML-FU remake fixes some of the problems. Nothing can be done about links that have disappeared.
Reason: Incorrect perl.
Solution: Renamed the subroutine not to overwrite the user's.
Reason: The Up link out of the document is loaded into a frame.
Solution: Maintain the external link at TARGET="_top".
Reason: Windows XP returns a bit set instead of a bit-cleared status.
Solution: Took out the status test.
Reason: Unknown.
Solution: Added a note to the problem solving section to unset TEXDEFS.
Reason: graphics_support.perl wants it, but the true version no longer exists.
Solution: It seems unlikely to be of much importance unless you include non eps graphics. I just added a note in the graphics problem solving section.
Reason: The authors of MiKTeX 2.4 lost the entire environment, including TEXINPUTS. They plan to fix it in 2.5, out some time in summer 2006.
Solution: Install script l2hinm24 creates a local miktex.ini that points dvips to the files. Latexl2h.m24 (.bat) uses a command line option to do the same with latex.
Reason: Another alignment bar bug. Probably due to the newer dvips. Ghostscript produces a sidebar that is too long.
Solution: Fixed that one too in l2hcrop.
Reason: While TeX was fixed, dvips was not. This was fixed in beta 8.
Solution: Install script l2hinmkt appends to dvips.ini in the main config directory for the earlier versions of 2.5.
Reason: Firefox seems to have switched to w3c recommended alignment.
Solution: Added this cropping too to l2hcrop. Made it the default instead of the Mozilla alignment.
Reason: My stupidity and/or lack of careful examination of the differences.
Solution: Both versions of the two files are now provided in a single LATEX2HTML directory, and l2h and makel2h swap in the appropriate versions.
Reason: Unknown. I am guessing an internal difference between dvips 5.83 and 5.85. Pstoimg.bat cannot find the page dimensions in the new version. But they are there the same way.
Solution: NOEPS is only used as a fix for problem figures. Noted that it does not work in new versions of TeX in the corresponding problem shooting subsubsection. The other, better, solution will have to be used. A solution of the pstoimg problem goes on the wishlist.
Reason: Unknown.
Solution: Curse. Put in a few s.
Reason: Unknown.
Solution: Curse. Copy it over into $RUNNING_TITLE.
Reason: A test that ppmtojpeg was an executable file. Sometime after Shankar’s thesis, I took out the path from the executables.
Solution: Drop this test. When installed according to instructions, ppmtojpeg is there. Otherwise, do not use L2H_JPGQ.
Reason: I guess it did not seem important to me at the time.
Solution: Added.
Reason: I did not think that thumbnail entries in images.pl would be different.
Solution: Fixed.
Reason: I am teaching math??
Solution: Fixed.
Reason: This is an html table limitation. Html 4.0 used a dirty trick to improve things a bit.
Solution: Put in a better dirty trick.
Reason: The obvious reason is that it is applied to the image itself, not to the table containing it. Whether there is a deeper reason for this??
Solution: Copied it to the containing table, if align=right or align=left is specified in htmlimage in lower case, with no quotes. If that is interfering with something I do not know about, just change case.
Reason: The alignment bar thickness for larger magnifications increases beyond the limits used in l2hcrop.
Solution: Made the bar thicknesses fontsize dependent.
Reason: By design, LATEX2HTML only looks at the first 80 characters of the latex source code of a figure or equation to see whether it is different from another. In addition, while doing so it will completely ignore all whitespace, braces, backslashes and more. (Look at the keys in an images.pl file to see what I mean.)
Solution: As of Sep. 2007, the keys now include a 32 bit CRC check sum of the entire latex source code of the figure or equation.
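Why the checksum helps can be seen in a small Python sketch. The function `old_style_key` only approximates the original key (the exact set of ignored characters and the order of truncating and stripping are assumptions), and the key format with an underscore and eight hex digits is invented for the example; only the use of a 32 bit CRC over the entire source comes from the text.

```python
import re
import zlib


def old_style_key(source):
    """Approximation of the original key: the first 80 characters with
    whitespace, braces and backslashes ignored."""
    return re.sub(r"[\s{}\\]", "", source[:80])


def key_with_crc(source):
    """Same key, extended with a 32 bit CRC of the entire source."""
    checksum = zlib.crc32(source.encode("utf-8")) & 0xFFFFFFFF
    return "%s_%08x" % (old_style_key(source), checksum)
```

With these definitions, `\frac{a}{b}` and `\frac{ab}{}` share the same old-style key, so a corrected formula would keep its stale image, while the keys with the CRC differ.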
eqnarray* environments.
Reason: For reasons unknown to mankind, these environments were terminated by three line breaks and additionally three blank paragraphs!
Solution: Eliminated two breaks and two empty paragraphs.
\citeindextrue does not make any links to in-text citations, just to items in the bibliography.
Reason: In natbib.perl, someone has disabled links to in-text citations, "if desired", unconditionally. Apparently, disabling them is always desirable in the view of this person.
Solution: Killed off the line that killed off the links.
\citeindextrue does not work with numerical references.
Reason: In natbib.perl, someone turned it off for numerical references.
Solution: Killed off the code that turned it off. If \citeindextrue works for LaTeX, it should also work for LATEX2HTML. (However, the references in the index will show authors and year, rather than authors and citation number.)
Reason: LATEX2HTML makes just a guess at \textwidth and \textheight. So they are inconsistent with the document.
Solution: Latex2html now takes these values, in pt, from $TEXTWIDTH and $TEXTHEIGHT if set in .latex2html-init.
\end{CJK}.
Reason: CJK.perl assumed the input encoding is in pairs of bytes, which is untrue for UTF-8.
Solution: Skipped the relevant pre-pre-processing if $CJK_AUTO_CHARSET is set to UTF-8. LATEX2HTML seems to handle it fine by itself.
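The byte-pair assumption is easy to check with a few lines of Python. The sketch below does not reproduce CJK.perl; it only shows that the same two characters occupy two bytes each in a double-byte encoding (GBK is used here as an example) but three bytes each in UTF-8, so splitting UTF-8 into pairs cuts through the middle of characters. The helper name `byte_pairs` is made up.

```python
def byte_pairs(data):
    """Split a byte string into two-byte chunks, the way a double-byte
    encoding would be read."""
    return [data[i:i + 2] for i in range(0, len(data), 2)]


text = "\u4e2d\u6587"            # two CJK characters
gbk = text.encode("gbk")         # 4 bytes: one pair per character
utf8 = text.encode("utf-8")      # 6 bytes: three bytes per character

# Each GBK pair decodes to exactly one character.
gbk_chars = [pair.decode("gbk") for pair in byte_pairs(gbk)]

# The UTF-8 pairs do not line up with character boundaries:
# the second pair holds the end of one character and the start of the next.
utf8_pairs = byte_pairs(utf8)
```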
Reason: Like linux developers, Perl developers hate backward compatibility. Their motto, too, is if it works, fix it.
Solution: Removed defined commands, added backslashes, changed substitution command modifiers, etcetera.
Reason: Like linux developers, Perl developers hate backward compatibility. Their motto, too, is if it works, fix it.
Solution: Found a regexp using a \G anchor in latex2html that removes comments but no longer worked correctly under Perl 5.22 (without error messages being generated). Rewrote the regexp to be far less convoluted, making it work again.
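For readers unfamiliar with the task, a simple comment-stripping regexp is sketched below in Python. It is not the regexp in latex2html, neither the old \G version nor the rewritten one, and it is deliberately simplified: it treats a percent sign as a comment start unless it is directly preceded by a backslash, does not handle an escaped backslash before the percent sign, and keeps the line break that TeX would swallow.

```python
import re

# A % starts a comment unless it is escaped as \%.
# Simplification: a literal \\ directly before % is not handled.
COMMENT = re.compile(r"(?<!\\)%.*")


def strip_comments(tex):
    """Remove TeX comments line by line, keeping the line structure."""
    return "\n".join(COMMENT.sub("", line) for line in tex.split("\n"))
```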