Aspose.PDF is Hard
-
But then, all documents are. I don't feel like writing down my question again, so I'll post the one I posted on their forum, and remind you all of an old friend:
They aren't very fast, and their forum seems to be an unformatted text dump of an old forum. They'll probably just ask me to copy paste my code, instead of explaining the solution to me. But it's worth a shot.
Documents are difficult in general, though.
-
From Aspose earlier this year:
I would like to share with you that the feature to avoid content overlapping is not supported yet. And the ticket ID PDFNET-38635, formerly PDFNEWNET-38635, was logged for the same feature request.
Are you 100% committed to Aspose [I hate it]???
-
@thecpuwizard I am not, however our company pays for it, and last time I used something else to make a spreadsheet, people complained that I wasn't using Aspose.
But at the same time, it does most of what I need fairly well, apart from being obnoxious.
As far as them talking about things not being supported, however, I don't believe it. I think they're answering wrong. I can make word documents and convert them to PDF if I have to, but I seriously don't believe there's no control at all over the area things get added to before overflowing. If I could shrink that, my problem would instantly be solved.
-
Can you set the margins larger to work around the issue?
-
@jaloopa The problem is that while I can increase the margins on paragraphs, that essentially means I have to manually design every page, when my goal is to use the automatic overflow for that, so I can focus on just putting all the elements I need into the document.
-
Does your company also pay for Aspose.Words?
Ok, I'll admit it is a long shot, but it is quite good at overflowing text in a Word document. Converting it to PDF when you're done is also no real issue.
-
@jbert Yeah, and that's what everyone here keeps telling me to do. After I've already got this around 80% implemented. If that's what I have to do, I'll go do it, but I refuse to believe this can't be done.
-
So I guess the problem is that the only box that seems to have any effect on the overflow is the Crop Box, and that also decreases the size of the page. I'm sure internally there's more to it than that!
-
I used Aspose.Words on a personal C# project and I ran into thing after thing after thing that simply didn't work. Setting page size on a document would randomly only work on a few pages in the middle of the doc, margin settings would be ignored or do wild things, etc. I ended up using a Word template file and that mostly got me there.
I've never used Aspose.PDF but if it's anything like their Word product, I expect it to be about 50% very useful and 50% blood-boiling frustrating.
-
@mott555 Considering they obfuscate all their non-public property names, their code comments are worthless, and the things their documentation site says are lies, I have a feeling your percentages may be off.
-
@magus said in Aspose.PDF is Hard:
your percentages may be off.
Probably, but my needs were extremely basic. (All I was doing was taking the output of some WPF RichTextBoxes and dumping it into a Word document with just a few specific formatting needs.) I expect the useful percentage to go way down if you try to do anything remotely complicated.
-
@mott555 My needs are minor as well. Headers and footers, and content that flows across pages, as a PDF. You'd think that that would be both easy and absolutely supported. Apparently not. In a system that specifically has header and footer objects.
So far, the most interesting thing I've learned is that all Rectangle instances are the same size, even when they are not, and are visibly not. They will always show the same values in the debugger. Also, the only property that seems to affect the overflow is CropBox, which also shrinks the page. AAAAAAAGH!!!
-
Bonus:
-
@magus That's a long and convoluted way to say "Deprecated. Use 'Box' instead."
-
@the_quiet_one It's actually worse than that because the "one year later" adds a lot of uncertainty. One year from WHEN?
-
@blakeyrat I'm starting to wonder if any of this was not outsourced at this point.
-
So apparently their documentation is a bundle of lies.
That blue square? It's supposed to be half that size (and I've gone much smaller than that even!) and on top of the grey table.
It does nothing. Why is this all so bad!? Why do people pay for this!?
-
@magus said in Aspose.PDF is Hard:
It does nothing. Why is this all so bad!? Why do people pay for this!?
We pay for it, but only for PDF conversion, not construction. We use OfficeWriter to construct Word/Excel docs then use Aspose to generate PDFs.
-
Let me put it this way: I live and breathe PDF. Aspose blows.
All PDF libraries except the Adobe one that costs more than you can afford have limitations. Most of them are stupid limitations.
Each one also has a specialty.
Except Aspose. I haven't figured out what it's actually good at yet.
-
@weng Why are PDF libraries so expensive/painful in general? Is PDF that much of a pain to compose? Or are paginated documents themselves a pain to work with?
-
@benjamin-hall said in Aspose.PDF is Hard:
Is PDF that much of a pain to compose? Or are paginated documents themselves a pain to work with?
I would answer that with yes and yes.
-
It's a "documents in general" thing.
It's easy if you can do things in a fixed way (at coordinates x put object y) but as soon as you need to start knowing things about the object you're placing to make the right decision, complexity increases exponentially.
Just to make a string wrap, you need to dive into font rendering. As TTF, OTF and PS fonts are all Turing complete, you're fucked.
When you're generating a PDF, it's best not to think of it as a document. It's really executable code in a Turing complete language with a bit of metadata and several other Turing complete languages attached to it.
A library isn't making a simple document. It's a fuckin' code generator.
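To make the string-wrapping point concrete, here is a minimal sketch of greedy line breaking. It assumes a caller-supplied measure function; the fixed_width metric (6 points per character) is purely hypothetical and stands in for the real font-metric lookup that makes this hard in practice.

```python
from typing import Callable, List


def wrap_greedy(text: str, max_width: float,
                measure: Callable[[str], float]) -> List[str]:
    """Greedy line breaking: add words until the next one would overflow."""
    lines: List[str] = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        # Only break if there is already something on the line; a single
        # overlong word is placed on its own line rather than dropped.
        if current and measure(candidate) > max_width:
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


def fixed_width(s: str) -> float:
    # Hypothetical metric: every character is 6 points wide.
    # A real library would have to consult the font here.
    return 6.0 * len(s)


if __name__ == "__main__":
    for line in wrap_greedy("the quick brown fox", 60.0, fixed_width):
        print(line)
```

Everything difficult lives inside measure: kerning, ligatures, and font substitution all change the width, which is why the wrapping loop itself is the easy part.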
-
@weng This.
Documents are what my company does, for the most part, and I can tell you all that they are horrible and complicated.
-
I thought this was a solved problem. Doesn't LaTeX do everything anyone could need?
-
@benjamin-hall For long form documents like textbooks, that's pretty much the best solution.
-
@weng I wrote my dissertation in LaTeX. For all the pain, it beat using Word (shudder) hands down.
Is automatic generation of LaTeX so hard as to be infeasible?
-
@benjamin-hall said in Aspose.PDF is Hard:
Is automatic generation of LaTeX so hard as to be infeasible?
Automatic generation is feasible, but I don't think it would work for most real-world mailings. (That, or I don't know enough about LaTeX.) I don't think you can tell it "hey, put my address at exactly X inches down/over so it fits in the window on the envelope". Even if you can, LaTeX documents tend to have large font sizes, margins, and space between lines; that means more pages in the document, which means it costs our clients more money to send.
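For what "exactly X inches down/over" means at the PDF level, here is a rough sketch. It assumes the page's default coordinate system (1/72 inch units, origin at the lower-left corner) and a US Letter page; the function name and the example offsets are hypothetical, not taken from any library.

```python
POINTS_PER_INCH = 72.0
LETTER_HEIGHT_PT = 11 * POINTS_PER_INCH  # US Letter is 11 inches tall (assumed page size)


def top_left_inches_to_pdf_points(x_in: float, y_in: float,
                                  page_height_pt: float = LETTER_HEIGHT_PT):
    """Convert 'x inches over, y inches down from the top-left corner'
    into default PDF user-space coordinates (origin at bottom-left)."""
    x_pt = x_in * POINTS_PER_INCH
    # PDF's y axis points up, so distance from the top is subtracted
    # from the page height.
    y_pt = page_height_pt - y_in * POINTS_PER_INCH
    return (x_pt, y_pt)


if __name__ == "__main__":
    # Hypothetical envelope-window position: 1 inch over, 2 inches down.
    print(top_left_inches_to_pdf_points(1.0, 2.0))
```

The arithmetic is trivial; the hard part is getting a layout engine to honor a fixed position while everything else on the page flows around it.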
-
@unperverted-vixen Font sizes and margins are just defaults, and are part of the reason LaTeX documents are so much easier to read.
I'm pretty sure positioning specific blocks at specific positions on the page is possible. But I wouldn't know how.
-
@benjamin-hall said in Aspose.PDF is Hard:
@weng I wrote my dissertation in LaTeX.
It would be more interesting to find someone who wrote their dissertation on LaTeX.
-
MigraDoc seems to be better than Aspose, and can put images on top of tables! But you have to build the latest version from source, because they haven't updated the nuget package since 2013.
-
Okay, so, next time you need to make a PDF in .NET, do yourself a favor and skip Aspose. Go to Git, find MigraDoc, build it, and use that. Reasons:
- When you try things, they work!
- When they don't, StackOverflow has your answers!
- The documentation makes sense and isn't all false!
Downsides:
- Images, in order to not round-trip through the filesystem, have to be encoded into a string of the format base64:<your encoded image> and passed in as the Image class's filename parameter.
- You have to build it.
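The string format for in-memory images can be sketched as follows. This only shows how the base64:<your encoded image> value described above is assembled; the helper name is hypothetical, and the MigraDoc call that consumes the string is not shown.

```python
import base64


def to_base64_image_name(image_bytes: bytes) -> str:
    """Build the 'base64:<encoded image>' string from raw image bytes.

    Hypothetical helper: the result is what would be passed where a
    filename is normally expected.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return "base64:" + encoded


if __name__ == "__main__":
    # Placeholder bytes; a real caller would pass the bytes of a PNG or JPEG.
    print(to_base64_image_name(b"abc"))
```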
-
@benjamin-hall said in Aspose.PDF is Hard:
@weng I wrote my dissertation in LaTeX. For all the pain, it beat using Word (shudder) hands down.
Is automatic generation of LaTeX so hard as to be infeasible?
What's wrong with Word?
-
@pie_flavor said in Aspose.PDF is Hard:
@benjamin-hall said in Aspose.PDF is Hard:
@weng I wrote my dissertation in LaTeX. For all the pain, it beat using Word (shudder) hands down.
Is automatic generation of LaTeX so hard as to be infeasible?
What's wrong with Word?
120 pages of figures, equations, and very picky formats. That's what's wrong.
-
@pie_flavor said in Aspose.PDF is Hard:
@benjamin-hall said in Aspose.PDF is Hard:
@weng I wrote my dissertation in LaTeX. For all the pain, it beat using Word (shudder) hands down.
Is automatic generation of LaTeX so hard as to be infeasible?
What's wrong with Word?
It's not that it can't work but it does tend to struggle with larger documents, and if you're unlucky it might even crash.
I have a document with 45 pages containing lots of tables for our API documentation and it freezes every so often. That's on a PC with 16 GB RAM, an Intel i7 and enough disk space.
-
@benjamin-hall I'm missing what about that is wrong.
https://i.imgur.com/JyNPFYl.png
https://i.imgur.com/gSJ1K4H.png
https://i.imgur.com/a5Wd39x.png
-
@pie_flavor said in Aspose.PDF is Hard:
@benjamin-hall I'm missing what about that is wrong.
https://i.imgur.com/JyNPFYl.png
https://i.imgur.com/gSJ1K4H.png
https://i.imgur.com/a5Wd39x.png
The year being 2010? That's when I wrote it. And the picky formats, for which I had a LaTeX style?
-
@benjamin-hall Word 2007 could do LaTeX. And what do you mean by picky formats?
-
@pie_flavor said in Aspose.PDF is Hard:
@benjamin-hall Word 2007 could do LaTeX. And what do you mean by picky formats?
Typing anything non trivial in the old equation editor was painful and corruption prone. I had a bunch of buddies fall prey to that. And I had hundreds of them, some of which were multiline, full width (in a two column document) monsters.
Dissertation committees are super picky about exact formatting. As in they sent it back for an edit because I had two spaces after a period. In one reference. Out of dozens. Near the end of 120+ pages.
It's not just things like margins, fonts and stuff. It's exact page layouts and hyper text table of contents.
-
@benjamin-hall said in Aspose.PDF is Hard:
120 pages of figures, equations, and very picky formats. That's what's wrong.
Ack. Even a couple of pages of figures and very picky formatting can have you tearing your hair out. Word makes it astonishingly difficult to position images (or any such blocks) precisely and keep them where you put them.
If any of you have never had the pleasure, blocks are attached to text (in a way that is not fully under the user's control), and anything that causes text to move even slightly (say, fixing a typo) may cause attached blocks to jump around, perhaps attaching themselves to a different piece of text, especially if the block is positioned at the top or bottom of a page or column.
-
@hardwaregeek and I had multipart figures with captions (and very picky format requirements).
Another advantage was that I could keep the chapters separate and compile them individually as well as in bulk. And LaTeX handles section/subsection numbers very easily, as well as references. With bibtex it even auto formats them for whatever style you need.
-
@hardwaregeek said in Aspose.PDF is Hard:
Word makes it astonishingly difficult to position images (or any such blocks) precisely and keep them where you put them.
-
@pie_flavor I still fight this every time I make tests. Because Word tries to be helpful. Like Clippy.
Don't get me wrong. Word is pretty good. But for some uses it's not the best. Like page layouts.
-
@benjamin-hall said in Aspose.PDF is Hard:
and very picky formats.
Well if you want the most absolute control over format...
-
@pie_flavor said in Aspose.PDF is Hard:
@benjamin-hall said in Aspose.PDF is Hard:
@weng I wrote my dissertation in LaTeX. For all the pain, it beat using Word (shudder) hands down.
Is automatic generation of LaTeX so hard as to be infeasible?
What's wrong with Word?
I had to manually implement the ToC and ToF for my dissertation because somehow the Word ones had become thoroughly broken. It was at the last minute so I have a phantom appendix listed...no-one seemed to notice.
Also, bloody list-numbering. It starts in the wrong place, won't start at all, you get duplicates, it worked then changes while you aren't looking....
-
@cursorkeys said in Aspose.PDF is Hard:
Also, bloody list-numbering. It starts in the wrong place, won't start at all, you get duplicates, it worked then changes while you aren't looking....
Trigger warning please! That's a constant battle using WORD 2016! when I'm writing tests (which are nested lists). It can't do it consistently, not even getting the right indentation.
-
@benjamin-hall said in Aspose.PDF is Hard:
Typing anything non trivial in the old equation editor was painful and corruption prone.
I took a couple of semesters' worth of linear algebra notes in Word 2007, matrices and all. But then again I didn't have to worry too much about formatting.
-
@hungrier said in Aspose.PDF is Hard:
@benjamin-hall said in Aspose.PDF is Hard:
Typing anything non trivial in the old equation editor was painful and corruption prone.
I took a couple of semesters' worth of linear algebra notes in Word 2007, matrices and all. But then again I didn't have to worry too much about formatting.
That makes me cringe just thinking about it. Math is best done with a pencil (or an active stylus on a touch screen).
And I've had issues with those old equation formats just plain not working when you transfer them between computers, even if they're running the same versions of Word. Especially if people still used the old .doc format. docx is better, but still has issues if the versions change significantly at all.
-
@benjamin-hall I never ran into anything like that since I only used the one laptop with its copy of Word. But it worked very well for my purposes, and I got quite proficient at entering the "Math autocorrect" shortcuts that I later learned corresponded almost if not 1:1 with their LaTeX equivalents.
-
So yeah, MigraDoc does actually have a prerelease package of their current version, and so the biggest downside is gone.
-
@pie_flavor said in Aspose.PDF is Hard:
@hardwaregeek said in Aspose.PDF is Hard:
Word makes it astonishingly difficult to position images (or any such blocks) precisely and keep them where you put them.
There are ways to coerce Word into (mostly) doing what you want. But they're very much not the default.
Also, while that looks like Word is cooperating nicely with you, what you're doing is actually fairly simple. Even though you're moving stuff around, the anchor point is mostly staying attached to the same, rather large paragraph. (There are a couple of times, when you're dragging the picture between the paragraphs that it attaches itself to the second paragraph.)
Now try that with short paragraphs (2 or 3 sentences, like you might find in a typical newspaper or magazine article) in a multi-column layout. And keep the picture fixed at the top of column 2 while you edit the text near the column break. Or notice that the picture at the top of the column is not perfectly aligned with the text and try to nudge it up a few pixels. (It's likely to jump to the previous column.) All with someone looking over your shoulder and yelling that you're incompetent because you can't move a picture a 32nd of an inch.
To be fair, Word was the wrong tool for the job. We should have been using something like Publisher, but we had Word, and Publisher was $$$.