Split string at second occurrence of "-" (Java)
-
So I've got a list of subdomains for a particular domain, all of which follow a particular pattern: $prefix-$service, where the prefix itself is always of the form \W{2}-D* (two letters, a dash, and any number of digits). I need to grab the prefix and add all the unique ones to a list.
Since I'm working on an ancient version of android-java, I don't have access to the set-based interfaces, so I'll have to deduplicate manually. That's doable, but N^2 (N should be small, however, < 10 most of the time, as these are ephemeral).
The issue I'm having (and maybe it's just my brain not braining) is figuring out how to extract the prefix off the front of the subdomain and disregard the $service part.
Is there a standard, not way of doing this?
-
@Benjamin-Hall Is it just as simple as domain.split("-[a-zA-Z]")[0]? The $service parts always start with letters, and since I don't care about what's in it, that seems to work. On an online playground at least.
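For anyone who wants to try this outside a playground, here is a minimal sketch of that split. The class and method names (SplitOnLetterDash, prefixOf) are made up for illustration, and it assumes, as described above, that the $service part always starts with a letter while the second half of the prefix never does.

```java
final class SplitOnLetterDash {
    // Returns everything before the first "-<letter>" sequence.
    // Assumption: the service part starts with a letter and the
    // second half of the prefix consists of digits only.
    static String prefixOf(String subdomain) {
        // Limit 2: only the first match matters, so stop splitting after it.
        return subdomain.split("-[a-zA-Z]", 2)[0];
    }
}
```

If no dash is followed by a letter, split finds nothing to split on and the whole input comes back unchanged, so malformed names are not rejected by this version.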
-
@Benjamin-Hall said in Split string at second occurrence of "-" (Java):
@Benjamin-Hall Is it just as simple as domain.split("-[a-zA-Z]")[0]?
Or domain.split("-",2)[1]
-
@loopback0 said in Split string at second occurrence of "-" (Java):
@Benjamin-Hall said in Split string at second occurrence of "-" (Java):
@Benjamin-Hall Is it just as simple as domain.split("-[a-zA-Z]")[0]?
Or domain.split("-",2)[1]
That gives the wrong part.
For an input subdomain of XX-1234-foo, that gives 1234-foo when I want XX-1234.
-
@Benjamin-Hall Oops. I misread the requirement. My bad.
-
Some combination of substring and indexOf? Or lastIndexOf if $service contains no hyphens.
Wild-ass guess just from looking at the docs: domain.substring(0, domain.indexOf('-', domain.indexOf('-')))
Disclaimer: this post comes without any warranty of correctness or fitness for any particular purpose, including fence-post errors
-
@topspin said in Split string at second occurrence of "-" (Java):
Some combination of substring and indexOf? Or lastIndexOf if $service contains no hyphens.
Wild-ass guess just from looking at the docs: domain.substring(0, domain.indexOf('-', domain.indexOf('-')))
Disclaimer: this post comes without any warranty of correctness or fitness for any particular purpose, including fence-post errors
Off-by-one error, you forgot to add 1: domain.substring(0, domain.indexOf('-', domain.indexOf('-') + 1))
Also, strings without two dashes should be handled:
String prefixToSecondDash(String domain) {
    final int firstDash = domain.indexOf('-');
    if (firstDash < 0) return domain; // or null? exception? depends...
    final int secondDash = domain.indexOf('-', firstDash + 1);
    if (secondDash < 0) return domain; // or null? exception? depends...
    return domain.substring(0, secondDash);
}
-
@Benjamin-Hall said in Split string at second occurrence of "-" (Java):
The issue I'm having (and maybe it's just my brain not braining) is figuring out how to extract the prefix off the front of the subdomain and disregard the $service part.
If you want the first bit, try this:
static String firstBit(String value) {
    String[] bits = value.split("-", 3);
    return (bits.length < 3) ? value : (bits[0] + "-" + bits[1]);
}
In general, this is the sort of task that you use (correctly!) regular expressions for.
-
@Kamil-Podlesak said in Split string at second occurrence of "-" (Java):
Off-by-one error, you forgot to add 1: domain.substring(0, domain.indexOf('-', domain.indexOf('-') + 1))
Also, strings without two dashes should be handled:
String prefixToSecondDash(String domain) {
    final int firstDash = domain.indexOf('-');
    if (firstDash < 0) return domain; // or null? exception? depends...
    final int secondDash = domain.indexOf('-', firstDash + 1);
    if (secondDash < 0) return domain; // or null? exception? depends...
    return domain.substring(0, secondDash);
}
Not javay enough. Maybe a null check or two and a couple of try-catches too. Maybe a lambda too.
-
@DogsB said in Split string at second occurrence of "-" (Java):
Not javay enough. Maybe a null check or two and a couple of try-catches too. Maybe a lambda too.
Actually, null check is not such a bad idea. As for the rest... no problem!
I have thrown in some function composition and currying as a bonus, for free!
private static String prefixToSecondDash(String domain) {
    final BiFunction<String, Integer, Optional<Integer>> nextDash =
        (str, idx) -> Optional.of(str.indexOf('-', idx)).filter(x -> x >= 0);
    final BiFunction<String, Integer, Optional<Integer>> secondDash =
        nextDash.andThen(o -> o.map(x -> x + 1).flatMap(idx1 -> nextDash.apply(domain, idx1)));
    try {
        return Optional.ofNullable(domain)
            .map(x -> x.substring(0, secondDash.apply(x, 0).orElse(x.length())))
            .get();
    } catch (NoSuchElementException e) {
        throw new IllegalArgumentException("Not enough dashes!", e);
    }
}
-
@Kamil-Podlesak said in Split string at second occurrence of "-" (Java):
Actually, null check is not such a bad idea. As for the rest... no problem!
I have thrown in some function composition and currying as a bonus, for free!
private static String prefixToSecondDash(String domain) {
    final BiFunction<String, Integer, Optional<Integer>> nextDash =
        (str, idx) -> Optional.of(str.indexOf('-', idx)).filter(x -> x >= 0);
    final BiFunction<String, Integer, Optional<Integer>> secondDash =
        nextDash.andThen(o -> o.map(x -> x + 1).flatMap(idx1 -> nextDash.apply(domain, idx1)));
    try {
        return Optional.ofNullable(domain)
            .map(x -> x.substring(0, secondDash.apply(x, 0).orElse(x.length())))
            .get();
    } catch (NoSuchElementException e) {
        throw new IllegalArgumentException("Not enough dashes!", e);
    }
}
The method should return an Optional, then PR approved.
-
I love how I messed up the closing tags, which affected every following quote, and no-one cared to fix it.
-
@topspin said in Split string at second occurrence of "-" (Java):
I love how I messed up the closing tags, which affected every following quote, and no-one cared to fix it.
The power of
-
@Benjamin-Hall you know the format exactly. Why not just make a regex capture?
That said, the regex in OP is completely wrong. Like, not even close. What you want is ^([A-Za-z]{2}-\d+)- or even just ^([^-]+-[^-]+)- if you don't actually care about the format beyond two dashes. Can be even easier if the service part has no dashes; then you just capture everything before the last dash: ^(.*)-[^-]+$
Note that in many (but not all!) APIs, the zeroth capture group corresponds to the entire match of the regex, so make sure you correctly retrieve the first group, not the zeroth.
As for deduplication, when your list is sorted then finding duplicates has linear complexity. Not useful in your case, but it's good to keep it in mind for the future.
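A minimal sketch of the regex capture in Java, using java.util.regex. The names PrefixRegex and prefixOf are hypothetical, the pattern assumes "two letters, a dash, one or more digits" as the reading of the OP's format, and returning null on a non-match is an arbitrary choice for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PrefixRegex {
    // Two letters, a dash, one or more digits, then the dash that
    // starts the service part. Group 1 is the prefix itself.
    private static final Pattern PREFIX =
            Pattern.compile("^([A-Za-z]{2}-\\d+)-");

    // Returns the captured prefix, or null when the input does not
    // match (hypothetical choice; an exception would work as well).
    static String prefixOf(String subdomain) {
        Matcher m = PREFIX.matcher(subdomain);
        // find() rather than matches(): the pattern only describes
        // the start of the string, not the whole of it.
        return m.find() ? m.group(1) : null;
    }
}
```

In Java, group(0) is the whole match (including the trailing dash here), so group(1) is the one to read.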
-
@Gąska said in Split string at second occurrence of "-" (Java):
@Benjamin-Hall you know the format exactly. Why not just make a regex capture?
That said, the regex in OP is completely wrong. Like, not even close. What you want is ^([A-Za-z]{2}-\d+)- or even just ^([^-]+-[^-]+)- if you don't actually care about the format beyond two dashes. Can be even easier if the service part has no dashes; then you just capture everything before the last dash: ^(.*)-[^-]+$
Yeah, it wasn't really supposed to be a regex. Just something vaguely suggestive of one. I could have captured using a regex. That's definitely correct.
As for deduplication, when your list is sorted then finding duplicates has linear complexity. Not useful in your case, but it's good to keep it in mind for the future.
I'm not sure how/if it's sorted, actually. It comes in the form of a nested JSON object with a bunch of other stuff I don't care about at all. It may be sorted alphabetically by subdomain, but I kinda doubt it. Although, now that I think about it, it very well might be.
Fortunately this is only needed (or called) for the sandbox versions of the app, and only once on startup if it doesn't know what sandbox to talk to. So it's not exactly hot path. And the total number of unique prefixes is generally small (3-10) as keeping sandboxes open is expensive (each one is 5 load balancers, 5-ish EC2 instances, and a bunch of other crap). So they tend to live for a day or so and get torn down (and set up again later if needed, that's all automated).
-
@Benjamin-Hall said in Split string at second occurrence of "-" (Java):
As for deduplication, when your list is sorted then finding duplicates has linear complexity. Not useful in your case, but it's good to keep it in mind for the future.
I'm not sure how/if it's sorted, actually.
Then sort it yourself. Deduplication by sorting is O(n log n). But again, not terribly useful in this particular case, as n is way too small for it to make sense.
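For reference, sort-then-scan deduplication could look roughly like the sketch below. SortedDedup and uniqueSorted are made-up names, and the code assumes the list contains no null entries (Collections.sort would throw on those).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SortedDedup {
    // Sorts a copy of the input, then keeps each value only once.
    // After sorting, equal values are adjacent, so comparing each
    // element with the last one kept is enough.
    static List<String> uniqueSorted(List<String> prefixes) {
        List<String> sorted = new ArrayList<String>(prefixes);
        Collections.sort(sorted);
        List<String> unique = new ArrayList<String>();
        for (String p : sorted) {
            if (unique.isEmpty() || !unique.get(unique.size() - 1).equals(p)) {
                unique.add(p);
            }
        }
        return unique;
    }
}
```

The result comes back in sorted order rather than in the order the prefixes were first seen, which may or may not matter here.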
-
@Gąska said in Split string at second occurrence of "-" (Java):
@Benjamin-Hall said in Split string at second occurrence of "-" (Java):
As for deduplication, when your list is sorted then finding duplicates has linear complexity. Not useful in your case, but it's good to keep it in mind for the future.
I'm not sure how/if it's sorted, actually.
Then sort it yourself. Deduplication by sorting is O(n log n). But again, not terribly useful in this particular case, as n is way too small for it to make sense.
I will note that the total array (before deduplication) is much larger. Each unique prefix occurs ~35 times, so there are roughly 100-500 elements in the array total.
-
@Benjamin-Hall best approach would be deduplicate-as-you-go. For each name, before adding the prefix to the list, do a linear search to check whether it's already there. So basically exactly what you'd do with a set, just with a flat list instead of a nice hashtable.
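A rough sketch of deduplicate-as-you-go with a plain list. The names DedupAsYouGo, prefixOf and uniquePrefixes are invented for the example; the prefix extraction reuses the indexOf approach from earlier in the thread and returns the input unchanged when there are fewer than two dashes, which is only one of the possible choices.

```java
import java.util.ArrayList;
import java.util.List;

final class DedupAsYouGo {
    // Everything before the second dash, or the whole string if
    // there are fewer than two dashes.
    static String prefixOf(String subdomain) {
        int first = subdomain.indexOf('-');
        if (first < 0) return subdomain;
        int second = subdomain.indexOf('-', first + 1);
        return second < 0 ? subdomain : subdomain.substring(0, second);
    }

    // Collects each prefix once, in first-seen order.
    static List<String> uniquePrefixes(List<String> subdomains) {
        List<String> unique = new ArrayList<String>();
        for (String s : subdomains) {
            String prefix = prefixOf(s);
            // contains() is a linear search over the prefixes kept so far.
            if (!unique.contains(prefix)) {
                unique.add(prefix);
            }
        }
        return unique;
    }
}
```

The cost is roughly (number of subdomains) x (number of unique prefixes), which stays small when only a handful of prefixes exist.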
-
@Gąska doesn't Java have something similar to .Net's HashSet<T>? Just add to the set, no explicit searching needed. And if you need a list then just ToList at the end. May not be the most runtime efficient way to do it, but it will make very simple and readable code.
-
@robo2 Don't forget about this part of the OP's post:
@Benjamin-Hall said in Split string at second occurrence of "-" (Java):
Since I'm working on an ancient version of android-java, I don't have access to the set-based interfaces, so I'll have to deduplicate manually. That's doable, but N^2 (N should be small, however, < 10 most of the time, as these are ephemeral).
-
@robo2 said in Split string at second occurrence of "-" (Java):
@Gąska doesn't Java have something similar to .Net's HashSet<T>?
OP says this particular version of Java doesn't. And I believe it, as I worked with ancient versions of Java myself. Otherwise it'd be a no-brainer to use a set.
-
@Gąska said in Split string at second occurrence of "-" (Java):
@robo2 said in Split string at second occurrence of "-" (Java):
@Gąska doesn't Java have something similar to .Net's HashSet<T>?
OP says this particular version of Java doesn't. And I believe it, as I worked with ancient versions of Java myself. Otherwise it'd be a no-brainer to use a set.
Yeah. That was my first thought, but no. Minimum API version is 14. ArraySet isn't usable until target API 23.
My plan for the dedup was basically like you said. Grab a prefix, check the list. If it's not there, add it.
-
@JBert said in Split string at second occurrence of "-" (Java):
@robo2 Don't forget about this part of the OP's post
That advice clearly came too late...
-
@Benjamin-Hall said in Split string at second occurrence of "-" (Java):
@Gąska said in Split string at second occurrence of "-" (Java):
@robo2 said in Split string at second occurrence of "-" (Java):
@Gąska doesn't Java have something similar to .Net's HashSet<T>?
OP says this particular version of Java doesn't. And I believe it, as I worked with ancient versions of Java myself. Otherwise it'd be a no-brainer to use a set.
Yeah. That was my first thought, but no. Minimum API version is 14. ArraySet isn't usable until target API 23.
What are you two talking about? According to the documentation, java.util.HashSet is "Added in API level 1".
Also, even if you don't have "set-based interfaces" (shouldn't that be "interface-based sets"?), you can use Hashtable or even Properties like a caveman.
-
@Benjamin-Hall said in Split string at second occurrence of "-" (Java):
@Gąska said in Split string at second occurrence of "-" (Java):
@robo2 said in Split string at second occurrence of "-" (Java):
@Gąska doesn't Java have something similar to .Net's HashSet<T>?
OP says this particular version of Java doesn't. And I believe it, as I worked with ancient versions of Java myself. Otherwise it'd be a no-brainer to use a set.
Yeah. That was my first thought, but no. Minimum API version is 14. ArraySet isn't usable until target API 23.
My plan for the dedup was basically like you said. Grab a prefix, check the list. If it's not there, add it.
Use some other kind of set. HashSet? TreeSet? Doesn't really matter. You can convert it back to an ArrayList when you're done or whatever.
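A sketch of the set-and-convert-back idea, in case it helps. SetDedup and unique are hypothetical names. LinkedHashSet is used here instead of the HashSet or TreeSet mentioned above purely to keep first-seen order; that is a choice made for the example, and either of the others would slot in the same way.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class SetDedup {
    // Drops duplicates by passing the values through a set, then
    // converts the result back to an ArrayList.
    static List<String> unique(List<String> prefixes) {
        // LinkedHashSet remembers insertion order; HashSet would not,
        // and TreeSet would return the values sorted.
        Set<String> seen = new LinkedHashSet<String>(prefixes);
        return new ArrayList<String>(seen);
    }
}
```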
-
@Benjamin-Hall said in Split string at second occurrence of "-" (Java):
Yeah. That was my first thought, but no. Minimum API version is 14. ArraySet isn't usable until target API 23.
When stuck on old Java versions, you can use Kotlin. It has all the good stuff and compiles to all the old platforms.
-
@boomzilla said in Split string at second occurrence of "-" (Java):
Use some other kind of set. HashSet? TreeSet? Doesn't really matter. You can convert it back to an ArrayList when you're done or whatever.
Shows that I don't know much Java. I just looked for the first reasonably-similar "Set" class.
-
Oops. Unhelpful troll self-Jeffed to the appropriate topic.
-
@HardwareGeek OP's problem is solved, trolling is okay now. But why do you think that telling people Java sucks is trolling? You're just stating the facts