Ruby strings "not a real hero"



  • cap = rf.try(:value2)

    This is from a ~1000 line god object, contained within one of ~600 files that used to contain ~9900 (now ~1100) ruby style violations and numerous larger issues like cyclomatic complexity and ABC violations, which I have taken upon myself to refactor because I like pain and sadness, apparently. Oh well, it wouldn't be any fun if it was easy.



  • As someone who knows some ruby, I can assure everyone who doesn't know any ruby that they understand that code just as much as I do.


  • Winner of the 2016 Presidential Election

    Don't worry, @ben_lubar, I don't know anything about ruby but I can explain to you what the code does:
    cap is a variable. It is a placeholder for something defined later in this code.
    = means the aforementioned variable will get something assigned.
    rf.try(:value2) is the thing being assigned.
    See, it's really simple if you look at it from a distance.

    Filed Under: Next time I'll teach you how to draw a perfect (from a distance) recreation of the Mona Lisa :trollface:


  • Discourse touched me in a no-no place

    @Kuro said:

    rf.try(:value2) is the thing being assigned.

    We can do better than that! rf appears to be an object, try a method of that object, and :value2 an argument being passed in. Easy!

    @aapis, why are you upset with this line of code? It's really not obvious at first glance.



  • It means cap = rf ? rf.value2 : nil. Which is nothing like cap = rf? rf.value2 :nil, because ruby is a jerk.
    There is nothing good about that line of code.
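
    For anyone following along, here is a minimal sketch of that equivalence. Record is a made-up stand-in for whatever class rf really is, and the &. operator needs Ruby 2.3 or later, which may be newer than the codebase under discussion.

```ruby
# Hypothetical stand-in for whatever class `rf` really is.
Record = Struct.new(:value2)

# Spelled-out version of the line under discussion.
def cap_from(rf)
  rf ? rf.value2 : nil
end

# Safe-navigation operator (Ruby 2.3+): skips the call only when rf is nil.
def cap_from_safe_nav(rf)
  rf&.value2
end
```

    Both helpers return nil for a nil rf and the field's value otherwise. As far as I understand the parser, the no-space variant differs because Ruby reads rf? as a method name ending in a question mark.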


  • Winner of the 2016 Presidential Election

    To be honest, I wanted to go with that at first as well, but knowing nothing of Ruby I wasn't sure whether try was a method of an object or just Ruby's way of starting a try-catch block. So I went with the safe route.

    Filed Under: Do you want me to add your analysis to mine, so @ben_lubar has a full referential post instead of multiple ones?


  • I survived the hour long Uno hand

    The colon means something. It's some ruby concept that I remember staring at for hours when I was trying to learn Ruby and ultimately just deciding to ignore.

    (after a quick Google) ah right, "Symbols", which are like strings but better because ruby magic. I think they're essentially members of a global enum under the hood, but they're described as magic versions of strings.


  • Discourse touched me in a no-no place

    @Buddy said:

    It means cap = rf ? rf.value2 : nil.

    So… they're calling a method on NULL nil? OK, I can see how that's possible, but I'm not convinced that that makes it a good idea.

    @Yamikuronue said:

    "Symbols"

    Now that was something I actually knew. It helps when you're coming from a language that already does that sort of stuff (though without the funky syntactic ceremony).



  • ruby magic == normal behavior
    ruby normal behavior == magic

    Specifically, what I'm talking about is how ruby symbols are like every other language's string literals, while the ruby syntax that looks like a string literal is actually a constructor. That is, "string".object_id != "string".object_id. Stupid ruby.
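
    A small sketch of the complaint, assuming the file has no frozen_string_literal magic comment (with one, identical literals may share an object). The helper names are made up for illustration.

```ruby
# Two string literals with the same content are equal by value...
def literals_equal_by_value?
  "string" == "string"
end

# ...but each literal evaluation builds a new String object, so this is
# expected to be false unless string literals are frozen for the file.
def literals_same_object?
  "string".equal?("string")
end

# A symbol literal always refers to the one interned Symbol object.
def symbols_same_object?
  :string.equal?(:string)
end

# Converting a string to a symbol lands on that same interned object.
def interned_string_is_symbol?
  "string".to_sym.equal?(:string)
end
```

    equal? compares object identity, which is the same thing the object_id comparison above is checking.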

    We've got some ruby code running at work that I've been too afraid to touch, but I'm pretty much gonna have to. It's broken in pretty much the worst way: As long as nothing bad happens it keeps ticking along, but we've got no testing, no deployment process, just one working copy of the code that completely seizes up any time anything even slightly out of the ordinary happens.



  • @Buddy said:

    Specifically, what I'm talking about is how ruby symbols are like every other language's string literals, while the ruby syntax that looks like a string literal is actually a constructor. That is, "string".object_id != "string".object_id. Stupid ruby.

    Technically that's true of Java and C# as well... it's just that they do string interning on string literals so that "string" == "string"



  • This thread is great for reinforcing my strong dislike of Ruby. I'd rather go back to writing PHP.



  • cap = rf?.value2

    C#TFY



  • Seriously? You think this is ok?

    Am I being punked?

    "Cap" = what is this, I have no idea. Cap of what, who is capped, is someone popping a cap in something?
    ".try" is the lazy man's if/then/else except it only returns the value or nil. It is generally a bad practice to use it. That one requires a bit of Ruby knowledge and may not be immediately obvious.
    "Rf" is the same as cap, a lazy awful variable name that gives no clues as to what it is for. Due to the "transient" nature of Ruby this could be a whole bunch of different things (object, struct, class/instance method) so the fact it looks like an object doesn't really help. I'm not saying you have to do full on Hungarian notation, but something a little more descriptive would be nice.
    "Value2" is a symbol, think of it like a string, whose value is literally value2.

    Every single part of this is bad.


  • ♿ (Parody)

    @aapis said:

    Seriously? You think this is ok?

    Stuff like short variable names is difficult to judge without more context.



  • @Yamikuronue said:

    "Symbols", which are like strings but better because ruby magic.

    The point is that Symbols are names of things.

    @Buddy said:

    That is, "string".object_id != "string".object_id. Stupid ruby.

    No sane language should guarantee those are equal. Ensuring that is just an unnecessary operation that won't have any effect in many cases, especially those where it is also most expensive. Long literals are usually unique, and if they are ever compared to anything, the other string will likely be read dynamically and therefore not share the same address anyway.

    You are not supposed to compare strings for object id, you are supposed to compare them for equality.

    @powerlord said:

    Technically that's true of Java and C# as well... it's just that they do string interning on string literals so that "string" == "string"

    In C# "string" == "string" because its == actually tests the expected thing, string equality. It does not say anything about interning.

    As for Java I don't know, but I don't think it actually promises to merge all string literals of the same value across the VM. "string" == "string" will be true, because the compiler is not stupid and will merge the instances in the same translation unit, but I would not be so sure about constants in different components or loaded via different class-loaders and such.

    Python implicitly interns string literals that look like they could be identifiers. I find the Ruby way better here.



  • I know Ruby kinda ok, and it doesn't look that bad to me, at least as just a line.

    Yes, cap and rf aren't particularly helpful variable names at first glance. How bad that is depends on where it's used. If they are local vars in a method/block that's a few lines long, then it's not great, but pretty excusable. If they're public variables on an API or something, then it's awful.

    I never heard of a .try method before, so I looked it up - catching exceptions uses a different syntax. Looks like it's a Rails ActiveSupport method added onto Object that will run a method by the name of the first param if it exists, and return nil instead of throwing otherwise. Seems handy for nil coalescing, but I'm a little on the fence about how good it is for things like ActiveSupport to add a pile of "handy" additions to core Ruby classes.

    That means that :value2 is a symbol which is the name of the method that it will try to run on rf. It is rather odd to have a method named value2, but it depends on context as to how bad that is.
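
    To make that description concrete, here is a rough sketch of the behaviour as I read it. try_call and Field are hypothetical names; this is not ActiveSupport's implementation, only a plain-Ruby approximation of "call the method if it exists, otherwise return nil".

```ruby
# Hypothetical helper mimicking the behaviour described above.
# NOT ActiveSupport's code, just a sketch of the idea.
def try_call(receiver, method_name, *args)
  return nil if receiver.nil?
  return nil unless receiver.respond_to?(method_name)

  # public_send looks the method up by its symbol name at runtime.
  receiver.public_send(method_name, *args)
end

# Hypothetical stand-in for whatever `rf` is in the original code.
Field = Struct.new(:value2)
```

    With this, try_call(Field.new("x"), :value2) gives back the stored value, while a nil receiver or an object without value2 yields nil instead of raising NoMethodError.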

    I'd be more worried about the other well-known types of Ruby excessive cleverness, like monkeypatching external Gems and core objects, or abusing method_missing.


  • Discourse touched me in a no-no place

    @Bulb said:

    I find the Ruby way better here.

    It's not clear to me why you should need strings to have identity in the first place, let alone to be able to fetch a description of that identity (the object_id). 🚎



  • @dkf said:

    It's not clear to me why you should need strings to have identity in the first place, let alone to be able to fetch a description of that identity (the object_id). 🚎

    Normally you shouldn't. And you generally don't.

    However symbols are a special case. Symbols are identifiers of variables and members, so in a dynamic language the runtime has to make several comparisons of them and hash lookups by them for each statement. This is so often that doing it by string content would hurt performance. So each symbol is inserted in a global hash table and from then on just the pointers are used for most operations. This is called “interning”.

    This technique is used by all dynamic language machines I've seen (perl, python, ruby, various lisps and schemes etc.) and also by at least the JVM. Ruby, lisps and schemes have separate symbol and string types where the former is interned and the latter is not, while the others have just strings and a method for interning them. And python and perl implicitly intern string literals that “look like identifiers”.

    The technique is also widely used in X11 (XAtom) and supported, but rarely seen in the wild, in Win32 (ATOM) to reduce the number of bytes that have to be transferred between processes.
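
    The global hash table idea can be sketched as a toy in plain Ruby. InternTable is a hypothetical class for illustration only; real runtimes do this in native code with far more care.

```ruby
# Toy intern table: one canonical frozen copy per distinct string content.
class InternTable
  def initialize
    @pool = {}
  end

  # Returns the canonical instance for this content, adding it on first sight.
  def intern(str)
    @pool[str] ||= str.dup.freeze
  end

  # Number of distinct contents seen so far.
  def size
    @pool.size
  end
end
```

    After interning, two names can be compared with equal? (a pointer comparison) instead of character by character. Note that nothing in this toy ever removes an entry from the pool.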


  • Discourse touched me in a no-no place

    @Bulb said:

    So each symbol is inserted in a global hash table and from then on just the pointers are used for most operations. This is called “interning”.

    There are other ways to handle that, but they're quite a lot more complex. (UNDERSTATEMENT!) The main problem with interning is that it's a memory leak; you cannot un-intern a string.

    Fixing the problem probably requires a redesign of the value model and type system to allow runtime annotation of strings with a garbage-collectable labelling indicating what the token is really understood to be, and it's really challenging to do that sort of thing. It's also very intrusive, with non-trivial consequences. You probably don't want to do it; the rabbit hole is deep, and has dragons hiding in it.



  • @dkf said:

    it's a memory leak; you cannot un-intern a string

    Well, the identifiers loaded from the code are not going away anyway and you are not supposed to use interned strings for much else. That's why I prefer the approach with symbol type; it is more obvious those are special strings for special purpose.

    In most cases strings are interned for any lookup, and if they are read, it might indeed be a memory leak. However this claims that string pool is garbage collected in Java 7+. It also says it requires manual tuning to be usable (Java as a compiled language does not use interned strings for variable and member access), so it is pretty crappy in practice.


  • Discourse touched me in a no-no place

    @Bulb said:

    However this claims that string pool is garbage collected in Java 7+.

    Then it's only a partially interned pool. Which is fine; Java can do that (it's got a pretty sophisticated reference model that allows for some very neat tricks) since anything sane that might observe the shenanigans would also stop the GC from reclaiming that entry. But it's all really tricky, and only really useful for code that's got to work continuously in a single process for long periods of time.



  • @powerlord said:

    Technically that's true of Java and C# as well... it's just that they do string interning on string literals so that "string" == "string"

    "string".intern.object_id == :string.object_id

    @Bulb said:

    No sane language should guarantee those are equal

    No sane language should guarantee that they are different. Ruby makes things that should be an implementation detail into a language feature and calls it “magic”



  • Ruby recently got Symbol GC, which will (of course) only take effect if you do stuff like "method_#{num}".to_sym or foo.class_eval "def method_#{num}\n#{num}\nend" (followed by throwing out the foo object).

    (i.e. symbols referenced directly in the code will have a reference taken to them by the code)


    Fun note: You can also do :"Hello, World!" and that is a symbol constant.
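
    A quick sketch of both kinds of symbol mentioned above. The helper names are made up, and the sketch only shows how the symbols are created, not the GC behaviour itself.

```ruby
# Symbol built at runtime from an interpolated string (the kind that
# Symbol GC is described as being able to collect).
def dynamic_method_symbol(num)
  "method_#{num}".to_sym
end

# Quoted symbol literal: any characters are allowed between the quotes.
def greeting_symbol
  :"Hello, World!"
end
```

    greeting_symbol.to_s gives back the plain string, and dynamic_method_symbol(3) is the very same object as the literal :method_3.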


  • Discourse touched me in a no-no place

    @riking said:

    (i.e. symbols referenced directly in the code will have a reference taken to them by the code)

    That's what ought to retain the symbols. On the plus side, if you unload the code, the symbols should go too. Dynamic languages ought to find throwing away the code comparatively easy.


  • FoxDev

    @riking said:

    Fun note: You can also do :"Hello, World!" and that is a symbol constant.

    it's crap like that that makes me lump Ruby in with PERL in the bin of languages that are "you've got to be giving me at least two Jacksons per minute to work with the language professionally"

    i'm sure it's a great language and all that but it makes me want to take a shower after merely groveling through the source code looking for an enum definition.... i shudder to think how much cleaning i'd need to do if i tried to develop in it....

    :-(


  • Discourse touched me in a no-no place

    @accalia said:

    it's crap like that

    As opposed to the way it's accepted normal practice to monkeypatch new stuff into core classes?


  • FoxDev

    that.... that is beyond crap if you ask me....

    i should not have to worry about the behavior of the core libraries suddenly changing on me just because i included another bit of code...

    I have a hard enough time following Ruby as it is, trying to keep track of who is messing with core is just.....
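
    For readers who haven't seen it, the practice being complained about looks roughly like this. The shout method is a hypothetical name picked for illustration; the mechanism (reopening a core class) is standard Ruby.

```ruby
# Reopening a core class: every String in the process gains this method
# as soon as this file is loaded.
class String
  # Hypothetical helper, chosen only for illustration.
  def shout
    upcase + "!"
  end
end
```

    Any code loaded afterwards can call "hello".shout, and by the same mechanism a library could redefine an existing String method, which is the part that makes the behavior of core classes depend on what else got required.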



  • Maybe the title ought to be "Ruby Strings should become An Hero".

    Except I like Ruby, at least what I'd seen of it, up until this thread...



  • TDWTF, the best place to learn why $LANGUAGE sucks!


  • Discourse touched me in a no-no place

    @accalia said:

    I have a hard enough time following Ruby as it is, trying to keep track of who is messing with core is just.....

    I don't have a hard time accepting that it is possible — I've overridden parts of libc in the past too — but that it's an everyday thing just strikes me as idiotic.

    I've other gripes about their string handling too (can't be bothered to remember the details right now) and I've not delved into the details much. I do know that some of the stuff with Rails is also amazingly weird, but that might be in part because of the tortured history of some of our internal apps, driven by some parts of management's insistence on declaring initial prototypes to be production ready.

    You can't make everything be hollywood film sets all the way down.


  • FoxDev

    @dkf said:

    I don't have a hard time accepting that it is possible

    neither do i. I've done it quite a bit in my test suite for SockBot. but that's a test suite where you read the test and it goes:

    • before the test go override fs.readFile to give this test output
    • do the test
    • after the test restore the original fs.readFile

    and that's fine and awesome. but doing it to monkey patch bugfixes in as the PREFERRED way to do it.....
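
    The three steps above, translated into a rough Ruby sketch (the original is a JavaScript test suite). ConfigStore and with_stubbed_read are hypothetical names standing in for fs.readFile and the test setup/teardown hooks.

```ruby
# Hypothetical module standing in for the real file-reading dependency.
module ConfigStore
  def self.read(path)
    "real contents of #{path}"
  end
end

# Overrides ConfigStore.read for the duration of the block, then restores it.
def with_stubbed_read(fake_content)
  original = ConfigStore.method(:read)          # before: remember the original
  ConfigStore.define_singleton_method(:read) { |_path| fake_content }
  yield                                         # do the test
ensure
  ConfigStore.define_singleton_method(:read, original) # after: put it back
end
```

    The ensure clause puts the original method back even if the block raises, so the override never leaks outside the test.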


  • Discourse touched me in a no-no place

    @accalia said:

    monkey patch bugfixes in as the PREFERRED way to do it...



  • @Bulb said:

    As for Java I don't know, but I don't think it actually promises to merge all string literals of the same value across the VM. "string" == "string" will be true, because the compiler is not stupid and will merge the instances in the same translation unit, but I would not be so sure about constants in different components or loaded via different class-loaders and such.

    Java 8 specification §3.10.5 says this:

    Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.

    I just assumed the same applied to C# as well, although you are correct that C# overrides == on String to check value equality instead of reference equality.



  • It's a pun / topical humor. Donald Trump called John McCain that.



  • @Buddy said:

    It's a pun / topical humor. Donald Trump called John McCain that.

    Oh, I forgot (or maybe didn't know) that. I still think suggesting that Ruby strings should commit suicide is nastier, though.



  • Changing other people's thread titles is kind of inappropriate. An hero is not obscure enough of a reference to drive me to do it the way that internment joke did.



  • Fair enough.


  • Discourse touched me in a no-no place

    @powerlord said:

    Moreover, a string literal always refers to the same instance of class String.

    Strictly, that doesn't disallow you from throwing away an interned String. It just means that if you keep around something which can tell the difference meaningfully, it prevents GC of the thing it is watching. None of which really matters usually, but it becomes significant once you're dealing with processes that need to unload significant amounts of code.



  • @dkf said:

    Strictly, that doesn't disallow you from throwing away an interned String. It just means that if you keep around something which can tell the difference meaningfully, it prevents GC of the thing it is watching. None of which really matters usually, but it becomes significant once you're dealing with processes that need to unload significant amounts of code.

    It did in Java 6 and below as interned strings were made part of the Permanent Generation. They stopped being put in PermGen in Java 7, and PermGen went away entirely in Java 8.


  • Discourse touched me in a no-no place

    @powerlord said:

    PermGen went away entirely in Java 8.

    🎉 🎊

