XML Diff?
-
I have to compare some sets of XML files for differences. With a text file it'd be easy--use Beyond Compare or any other diff tool--but these XML files, like many, are each written on one single line, so the tools break down (1 out of 1 lines differs between the two files!).
Anyone know of a tool to make this easier, that would maybe show me highlights of just the different portions of each file?
-
@FrostCat perhaps if you pretty-print and then diff?
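For what it's worth, here is a minimal sketch of that idea in Python, using only the standard library. The function names (pretty, xml_diff) and the sample documents are made up for illustration; nothing here is from a specific tool mentioned in this thread.

```python
import difflib
from xml.dom import minidom


def pretty(xml_text):
    """Re-indent one-line XML and return it as a list of lines."""
    doc = minidom.parseString(xml_text)
    # toprettyxml puts each element on its own line; skip any blank lines.
    return [line for line in doc.toprettyxml(indent="  ").splitlines() if line.strip()]


def xml_diff(old, new):
    """Unified diff of two XML strings after pretty-printing both."""
    lines = difflib.unified_diff(
        pretty(old), pretty(new), "old.xml", "new.xml", lineterm=""
    )
    return "\n".join(lines)


# Hypothetical sample data: two one-line documents differing in one tag.
old = "<order><id>1</id><qty>2</qty></order>"
new = "<order><id>1</id><qty>3</qty></order>"
print(xml_diff(old, new))
```

Once both sides are re-indented, any ordinary line-based diff tool should work just as well as difflib; the sketch only shows that the pretty-printing step is what makes the diff usable.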
-
@anotherusername Most of what's mentioned on that page seems to be *nix only, but I'll look.
-
@FrostCat I'll write an app for 4995.
-
-
I'll write an app for 4995.
If that price is in mills, I'll take you up on it. Otherwise, I don't need it urgently enough to spend money on it (so paid utils are out, too.)
Windiff or similar would be fine if the XML wasn't all on one damn line, but was pretty-printed.
-
@anotherusername said in XML Diff?:
There's a C# code sample and a PowerShell script.
That looks like it might just do what I want, thanks.
-
Windiff or similar would be fine if the XML wasn't all on one damn line, but was pretty-printed.
Well, we wouldn't want to waste any spare bytes on newlines and spaces, would we?
Never mind the fact that we're using a format that duplicates every tag name.
-
@FrostCat More seriously, why not just use a pretty-printing library then?
-
@FrostCat Visual Studio will "pretty print" (I hate that term) XML. Open up the file and hit Control-E, D.
-
@blakeyrat said in XML Diff?:
@FrostCat Visual Studio will "pretty print" (I hate that term) XML. Open up the file and hit Control-E, D.
I imagine it comes from Unix people who prefer proper all-on-one-line xml, and consider such a function frivolous.
-
More seriously, why not just use a pretty-printing library then?
Because I wasn't aware of any until just now. I don't normally work much with XML.
-
@blakeyrat said in XML Diff?:
"pretty print"
"reformat" or whatever the VS term is--I don't have it open at the moment--is fine by me, I'm just using what other people say.
I didn't even think of doing that but I'll give it a try, thanks. For my purposes this would be easier than the other answer.
ETA: for some reason the keybindings are different in my VS--I had to use ^K^D. But it worked perfectly--I had two sets of files that differed by having the contents of about 1% of the tags differ.
-
I imagine it comes from Unix people who prefer proper all-on-one-line xml, and consider such a function frivolous.
XML is usually meant to be machine-readable, isn't it? Human-readable formatting wouldn't be necessary in "most" cases, I guess, and would just bloat the file, would be the argument.
-
@FrostCat so they made a file that is much bigger than a binary file, and still not human readable
-
@FrostCat XML is relatively human-readable as well though: Its crazy verbosity is a benefit in some cases, because the context is always visible, unlike JSON for example.
But anyway, shut up and let me make my bad joke at unix fanatics' expense!
-
ETA: for some reason the keybindings are different in my VS--I had to use ^K^D. But it worked perfectly--I had two sets of files that differed by having the contents of about 1% of the tags differ.
I think they changed it from 2013 -> 2015. K D is actually more logical given the other shortcuts in the Edit -> Advanced menu.
-
XML is relatively human-readable as well though:
Sure, if you format it so it's not all on one big line. And you don't use stupid tag names or namespaces.
-
-
Sounds like it's solved now, but Notepad++ has an XML plugin that will pretty-print.
-
XML is usually meant to be machine-readable, isn't it? Human-readable formatting wouldn't be necessary in "most" cases, I guess, and would just bloat the file, would be the argument.
Your argument would make sense if XML was a binary format.
-
Your argument would make sense if XML was a binary format
Lots of formats that aren't "binary" aren't designed for readability. For example, pretty much any fixed-width file format (like a GL extract) or the tax/unemployment information that goes to the state: 512-byte rows of gibberish unless you have the key.
Any file that can't be read in Notepad easily (because it's all on one line) isn't particularly readable, even if it's technically readable.
-
Lots of formats that aren't "binary" aren't designed for readability.
Depends on what you mean by binary. For example, to me, a Base64-encoded ZIP archive is a binary format.
Any file that can't be read in Notepad easily (because it's all on one line) isn't particularly readable, even if it's technically readable.
I never said all text formats are readable.
-
Any file that can't be read in Notepad easily (because it's all on one line) isn't particularly readable, even if it's technically readable.
You wait until you see the back-end output from some of my programs. It's CSV and so it's one record per line, but nearly all the fields consist of long URIs without much readable nature about them. Human readable? Yes, but actually no.
-
You wait until you see the back-end output from some of my programs.
This sounds like you're going to disagree with me, but then you wrote something that doesn't.
-
I never said all text formats are readable.
I'm not even sure what your point is any more.
-
Any file that can't be read in Notepad easily (because it's all on one line)
You might want to fix your line endings...
Honestly I don't think that I've ever seen a serializer that would spit out unindented XML. JSON, yes, all the time, but never XML.
Which gives me an idea - deserialize, and then reserialize with a three-line C# or Java app?
-
@Maciejasjmj said in XML Diff?:
You might want to fix your line endings..
I don't think the tool I am using supports doing that.
Normally it wouldn't matter. But I had a couple of minor problems with the first version of an interface program I wrote, and the easiest way to see if I fixed it is to be able to look at a couple of nodes side-by-side.
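If a side-by-side look at individual nodes is the goal, a rough alternative to a line diff is to compare the two trees directly. The sketch below is an assumption-laden illustration in Python: leaf_values and changed_nodes are hypothetical names, attributes are ignored, and only the text of leaf elements is compared.

```python
import xml.etree.ElementTree as ET


def leaf_values(xml_text):
    """Map each leaf element's path (e.g. 'order/qty[0]') to its text."""
    values = {}

    def walk(elem, path):
        children = list(elem)
        if not children:
            values[path] = (elem.text or "").strip()
            return
        seen = {}  # per-tag counter so repeated siblings get distinct paths
        for child in children:
            index = seen.get(child.tag, 0)
            seen[child.tag] = index + 1
            walk(child, f"{path}/{child.tag}[{index}]")

    root = ET.fromstring(xml_text)
    walk(root, root.tag)
    return values


def changed_nodes(old, new):
    """Return {path: (old_text, new_text)} for leaves that differ.

    A leaf present on only one side shows up with None on the other.
    """
    a, b = leaf_values(old), leaf_values(new)
    return {
        path: (a.get(path), b.get(path))
        for path in sorted(a.keys() | b.keys())
        if a.get(path) != b.get(path)
    }


# Hypothetical sample data.
old = "<order><id>1</id><qty>2</qty></order>"
new = "<order><id>1</id><qty>3</qty></order>"
for path, (before, after) in changed_nodes(old, new).items():
    print(path, before, "->", after)
```

Because paths are built from tag names and sibling positions, an element inserted in the middle of a repeated list will shift every later sibling, so this is only reasonable when both files have the same overall shape.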
-
@FrostCat I don't think that is true, it is verbose but quite easy to read when formatted correctly. The same would be true of JSON.
-
@blakeyrat Ctrl K + D has been there since 2010.
-
when formatted correctly.
So, not the case I was working with, nor what seems to be the common case with generated XML?
-
I'm not even sure what your point is any more.
That XML manages to be extremely bloated yet not very human-readable. The worst of both worlds.
-
@Gąska Oh, well, yes. So you weren't disagreeing with me.
@blakeyrat's suggestion to autoformat the file works for what I needed, at the cost, as I said, of bloating the file quite a bit. (Since I'm at home now I can't check the size difference.)
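For anyone curious about how much the indentation adds, a quick way to measure it is to serialize the same tree twice. This is just a sketch in Python (the sizes helper is a made-up name, and ET.indent needs Python 3.9 or later); the actual overhead depends entirely on how deep and how wide the document is.

```python
import xml.etree.ElementTree as ET


def sizes(xml_text):
    """Return (compact_bytes, indented_bytes) for the same document."""
    root = ET.fromstring(xml_text)
    compact = len(ET.tostring(root))
    ET.indent(root, space="  ")  # Python 3.9+; modifies the tree in place
    indented = len(ET.tostring(root))
    return compact, indented


# Hypothetical sample data.
compact, indented = sizes("<order><id>1</id><qty>2</qty></order>")
print(compact, indented)
```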
-
@FrostCat either you use XML or you care about size. There's no middle ground.
-
either you use XML or you care about size.
I don't care, I was just noticing.
-
@FrostCat your choice of words (specifically "cost") made me think it matters to you how big the file is.
-
your choice of words (specifically "cost") made me think it matters to you how big the file is.
Oh. No, it was just an observation. I'm only doing this to save myself some time--the only way, ultimately, to verify my changes are correct is to submit them to the API, and it takes a LONG time to get back a response. If I can fairly easily validate a few things on my own, I can save myself a bunch of time. By which I literally mean days.
-
@Maciejasjmj said in XML Diff?:
Honestly I don't think that I've ever seen a serializer that would spit out unindented XML. JSON, yes, all the time, but never XML.
Unindented was the default in something I used once. Some C++ library.
-
I'm not even sure what your point is any more.
The usual end result from an argument with @gaska
-
@Maciejasjmj said in XML Diff?:
Any file that can't be read in Notepad easily (because it's all on one line)
You might want to fix your line endings...
Honestly I don't think that I've ever seen a serializer that would spit out unindented XML. JSON, yes, all the time, but never XML.
Which gives me an idea - deserialize, and then reserialize with a three-line C# or Java app?
That's a blatant lie. The try catch you'll need is already 6 lines.
-
-
@Maciejasjmj said in XML Diff?:
@DogsB Can't you put throws WhateverException on main()?
Filed under: don't try this at home kids
Why yes! I'm actually doing it now.
-
The try catch you'll need is already 6 lines.
try{DoXMLStuff();}catch(exception up){throw up;}
-
@FrostCat XML is a compromise between machine-readable and human-readable that (IMO) ends up sucking for both.
-
@Maciejasjmj Is there anything wrong with that? If I don't really have any way to handle the exception I'll obviously throw it to the JVM to crash in an appropriate manner.