P
@dohpaz42 said:@pjt33 said:Instantiating a new String instance is not a performance impact. Object creation is cheap, in spite of what anti-GC trolls will tell you. However, copying the contents of the two strings which are being concatenated will get expensive when they get long.
Ah, well, that's good to know. What do you mean by "copying the contents of the two strings..."? Would this not occur for other methods (such as StringBuilder)? I ask in earnest, because I honestly don't know and I would like to understand these sort of things.
What goes on behind the scenes in languages with an immutable string is along the lines of (in a Java/C#-ish pseudocode)
static string op+(string a, string b) {
char[] buf = new char[a.length + b.length];
arraycopy(a.buf, 0, buf, 0, a.length);
arraycopy(b.buf, 0, buf, a.length, b.length);
return new string(buf);
}
If you have m strings each of length n and you add them using + then the first + copies 2n chars, the second copies 3n chars, etc; the last copies mn chars. Overall cost is quadratic in m.
A string builder, on the other hand, has a backing array which it expands exponentially. So the append method would be something like
void append(string a) {
if (currentLength + a.length > buf.length) {
int newLength = buf.length * 2;
while(newLength < currentLength + a.length) newLength *= 2;
char[] newBuf = new buf[newLength];
arraycopy(buf, 0, newBuf, 0, currentLength);
buf = newBuf;
}
arraycopy(a.buf, 0, buf, currentLength, a.length);
currentLength += a.length;
}
There's an absolute minimum of mn character-copies required. The number of additional copies from expanding the buffer depends on its initial size, but we can bound it. If the final length of the buffer is N, expanding it to N copied no more than N/2 characters from a buffer no longer than N/2. We can reason inductively and deduce that so the total number of additional copies is no more than N/2 + N/4 + N/8 + ... = 2N. And from that you can show that the number of character-copies is linear in m.