<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Recent changes to bugs</title><link>https://sourceforge.net/p/pdftohtml/bugs/</link><description>Recent changes to bugs</description><atom:link href="https://sourceforge.net/p/pdftohtml/bugs/feed.rss" rel="self"/><language>en</language><lastBuildDate>Tue, 19 Jul 2016 00:37:33 -0000</lastBuildDate><atom:link href="https://sourceforge.net/p/pdftohtml/bugs/feed.rss" rel="self" type="application/rss+xml"/><item><title>#94 Missing letters from converted PDF</title><link>https://sourceforge.net/p/pdftohtml/bugs/94/?limit=25#d72b</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;Is this project dead? It has been almost ten years and this is still not fixed.  :(&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Daniel Bair</dc:creator><pubDate>Tue, 19 Jul 2016 00:37:33 -0000</pubDate><guid>https://sourceforge.neta77cf72f682e8c975e21732d3f875de0c1d5a253</guid></item><item><title>wrong XML open/close &lt;a tag in xml generation</title><link>https://sourceforge.net/p/pdftohtml/bugs/97/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;The opening tag is &amp;lt;A while the closing tag is &amp;lt;a.&lt;br /&gt;
This is not valid XML.&lt;/p&gt;
&lt;p&gt;Thank you for this great application, by the way.&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Louis Hoefler</dc:creator><pubDate>Wed, 27 Oct 2010 15:17:49 -0000</pubDate><guid>https://sourceforge.net55fb466a3cc9dd071751ce9b780fa158e5d5af85</guid></item><item><title>Fails to extract most of text from angled magazine columns</title><link>https://sourceforge.net/p/pdftohtml/bugs/96/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;When trying to extract text from angled magazine columns, leading words of paragraphs may be extracted, but most of the text is lost.  Errors such as 'Illegal entry in bfrange block in ToUnicode CMap' are reported.&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">MBM</dc:creator><pubDate>Tue, 12 Jan 2010 12:13:48 -0000</pubDate><guid>https://sourceforge.netbb184ec952e5f9fd99e3b147c3ceb962dd8a2bd3</guid></item><item><title>Title property is not populated into all &lt;title&gt; tags</title><link>https://sourceforge.net/p/pdftohtml/bugs/95/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;I have seen this in many PDF documents: KPDF shows a title in the file properties that is not populated into all &amp;lt;title&amp;gt; tags, only the one in the .html file, but many people need it in the _s.html file as well. When indexing PDF documents, most people use the _s.html file, as it is the one that has content; the .html file contains only a frameset. I have attached a document that is publicly available at info:http://www.scubenewmedia.biz/Modulo_domanda_riduzione_tasse_2008_.pdf (in case the document goes away some day)&lt;br /&gt;
Versions: 0.33a and 0.36&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Miguel Ángel Vilela</dc:creator><pubDate>Wed, 04 Feb 2009 11:05:07 -0000</pubDate><guid>https://sourceforge.net5d0ad28031b5b66679389aaae8acdcdea6778d3a</guid></item><item><title>Missing letters from converted PDF</title><link>https://sourceforge.net/p/pdftohtml/bugs/94/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;After converting a simple PDF, I find a number of letters are missing, most notably the second 'l' in double-l words like 'fellow'.&lt;/p&gt;
&lt;p&gt;As far as I can tell, the missing characters are represented normally in the PDF file. Certainly, I can copy and paste the relevant text from Acrobat into TextEdit and the letters show up as expected.&lt;/p&gt;
&lt;p&gt;I have attached a one-page sample file that demonstrates the issue.&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Kevin Yank</dc:creator><pubDate>Thu, 18 Dec 2008 04:59:56 -0000</pubDate><guid>https://sourceforge.net4760546dc5b531d7c45e1f92bccd17c3438c96c1</guid></item><item><title>Building fails with GCC 4.2.3</title><link>https://sourceforge.net/p/pdftohtml/bugs/93/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;make[1]: Entering directory `/home/milanb/devel/temp/pdftohtml-0.39/src'&lt;br /&gt;
g++ -g -O2 -DHAVE_CONFIG_H -DHAVE_DIRENT_H=1  -I.. -DHAVE_REWINDDIR=1 -DHAVE_POPEN=1 -I.. -I../goo -I../xpdf -I../fofi -I../splash -I           -I/usr/X11R6/include -c HtmlOutputDev.cc&lt;br /&gt;
In file included from HtmlOutputDev.h:22,&lt;br /&gt;
from HtmlOutputDev.cc:31:&lt;br /&gt;
HtmlLinks.h:22: error: extra qualification 'HtmlLink::' on member 'isEqualDest'&lt;br /&gt;
HtmlOutputDev.cc: In member function 'void HtmlPage::dumpComplex(FILE*, int)':&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Anonymous</dc:creator><pubDate>Thu, 02 Oct 2008 16:19:02 -0000</pubDate><guid>https://sourceforge.netbdb87227843acf978fe0cfd9d469fd1b278d16c5</guid></item><item><title>it crashes when filename has % in it.</title><link>https://sourceforge.net/p/pdftohtml/bugs/92/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;I tried this on both 0.36 and 0.40a.  When I run "pdftohtml -c filename%1.pdf" or "pdftohtml -c filename%1.pdf filename%1.html", it crashes with the following errors.  There is no error if I rename the file to filename1.pdf.&lt;/p&gt;
&lt;p&gt;$ pdftohtml -c tb-%027.pdf&lt;br /&gt;
Page-1&lt;br /&gt;
Page-2&lt;br /&gt;
Page-3&lt;br /&gt;
Page-4&lt;br /&gt;
Page-5&lt;br /&gt;
Page-6&lt;br /&gt;
Page-7&lt;br /&gt;
Page-8&lt;br /&gt;
Page-9&lt;br /&gt;
Error: /undefinedfilename in --showpage--&lt;br /&gt;
Operand stack:&lt;br /&gt;
1   true&lt;br /&gt;
Execution stack:&lt;br /&gt;
%interp_exit   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   --nostringval--   false   1   %stopped_push   1889   1   3   %oparray_pop   1888   1   3   %oparray_pop   1872   1   3   %oparray_pop   1755   1   3   %oparray_pop   --nostringval--   %errorexec_pop   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--   1761   0   5   %oparray_pop   --nostringval--   --nostringval--&lt;br /&gt;
Dictionary stack:&lt;br /&gt;
--dict:1153/1684(ro)(G)--   --dict:0/20(G)--   --dict:93/200(L)--   --dict:65/75(L)--   --dict:18/25(L)--&lt;br /&gt;
Current allocation mode is local&lt;br /&gt;
Last OS error: 2&lt;br /&gt;
Current file position is 38845&lt;br /&gt;
GPL Ghostscript SVN PRE-RELEASE 8.61: Unrecoverable error, exit code 1&lt;br /&gt;
Error: Failed to launch Ghostscript!&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">dealmaker</dc:creator><pubDate>Sat, 28 Jun 2008 04:49:51 -0000</pubDate><guid>https://sourceforge.net2390b69b02a43fc87b042b6bdb3b92c363a80171</guid></item><item><title>half-baked when convert pdf to xml using pdftohtml-0.39-win3</title><link>https://sourceforge.net/p/pdftohtml/bugs/91/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;I have been puzzled by this issue. When I use pdftohtml-0.39-win32 to convert PDF to XML (command: pdftohtml.exe -xml jstl-1_0-fr-spec.pdf test),&lt;br /&gt;
the conversion sometimes works perfectly, but most of the time it is only partial and some information is missing, for example:&lt;br /&gt;
&amp;lt;text top="211&lt;br /&gt;
tion&amp;lt;/b&amp;gt;&amp;lt;/text&amp;gt;&lt;br /&gt;
&amp;lt;text top="143" left="168" width="23" height="10" font="23"&amp;gt;key&amp;lt;/text&amp;gt;&lt;/p&gt;
&lt;p&gt;I tried other PDF files, and the issue still occurs sometimes.&lt;br /&gt;
If anybody has a solution, please contact me:&lt;br /&gt;
huamarco@gmail.com&lt;br /&gt;
Thank you very much :)&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Marco</dc:creator><pubDate>Fri, 06 Jun 2008 10:28:38 -0000</pubDate><guid>https://sourceforge.net68cd1f8608453d3064f0434d79f2e19d42e1926a</guid></item><item><title>Fix for unquoted &lt; &gt; in XML/HTML output</title><link>https://sourceforge.net/p/pdftohtml/bugs/90/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;In HtmlFont.cc, the HtmlFilter function does not quote the characters " &amp;amp; &amp;gt; &amp;lt; when they are returned from the mapUnicode function.&lt;/p&gt;
&lt;p&gt;I've patched that function to first map Unicode and then quote the characters that are forbidden in XML.&lt;/p&gt;
&lt;p&gt;The method is marked with: // this method if plain wrong todo &lt;br /&gt;
... anyway.&lt;/p&gt;
&lt;p&gt;Probably related bug reports:&lt;br /&gt;
874352&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">jensq</dc:creator><pubDate>Fri, 30 May 2008 11:52:13 -0000</pubDate><guid>https://sourceforge.net9eeb46e62896925430f991462774c7d914694b0d</guid></item><item><title>letter case of closing tags in windows binary</title><link>https://sourceforge.net/p/pdftohtml/bugs/89/</link><description>&lt;div class="markdown_content"&gt;&lt;p&gt;I see this has been fixed in CVS, but for some reason the fix hasn't made it into the Windows 0.39 release.&lt;/p&gt;
&lt;p&gt;&amp;lt;A&amp;gt;&amp;lt;/a&amp;gt;&lt;/p&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Benjamin Faden</dc:creator><pubDate>Wed, 14 May 2008 13:45:42 -0000</pubDate><guid>https://sourceforge.net7de7eeb43f56a13547abcd9f3e16ba5f8c41f879</guid></item></channel></rss>