1 month to train an intern
-
So the candidate #1 whom we took in for a month-long internship has settled in. Installed Windows. Installed Linux. Set the project up. Even updated some documentation. All great.
So now the time has come to try and make a real coder out of him. Gave him the link to [Eloquent JavaScript](http://eloquentjavascript.net/) and left him at it. A day later he's 4 chapters in, so I figured he's due for a little test, to measure his progress. And I had the perfect thing - a little function I actually needed for a real project.
```javascript
/*
Split domain into a top domain and subdomain parts. Returned as array.
Eg:
"a.b.mysite.com" => ["a.b", "mysite.com"]
"mysite.com" => [null, "mysite.com"]
"com" => [null, "com"]
"" => [null, null]
*/
function splitDomain(domain) {

}
```
I set him up with a quick repl.it page, including my standard ad-hoc testing harness and let him hack at it. Then I wrote the thing in 2 minutes and went on with my work.
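The 2-minute version isn't shown in the post, so the following is only a guess at what it could look like: a minimal sketch that satisfies the four examples in the comment above, assuming the "top domain" is always the last two labels. The body is hypothetical, not the author's actual code.

```javascript
// Hypothetical reference version; the original is not shown in the post.
function splitDomain(domain) {
  // An empty string has no labels at all, so both parts are null.
  if (domain === "") return [null, null];

  var labels = domain.split(".");
  // The last two labels (or the only label, for "com") form the top domain.
  var top = labels.slice(-2).join(".");
  // Whatever comes before them is the subdomain; null when nothing is left.
  var sub = labels.slice(0, -2).join(".") || null;
  return [sub, top];
}
```

Note that it returns the result rather than printing it, which is what the test harness needs.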
An hour later, he comes back with this:
```javascript
function splitDomain(domain) {
  var count=0;
  var split = [];
  var first_dot=0;
  for(var i=0; i<domain.length; i++){
    if(domain[i] == "."){
      count++;
      if(count==2)
        dot = i;
    }
  }
  if(count==3){
    top_domain=(domain.slice(0,dot));
    sub_domain=(domain.slice(dot+1));
  }
  else if(count==1){
    top_domain = null;
    sub_domain = domain;
  }
  else{
    top_domain = null;
    if(domain.localeCompare("com")==0)
      sub_domain = null;
    else
      sub_domain = domain;
  }
  split.push(top_domain);
  split.push(sub_domain);
  console.log(split);
}
```
Output:
```
[ 'a.b', 'mysite.com' ]
[ null, 'a.mysite.com' ]
[ null, 'mysite.com' ]
[ null, 'mysite' ]
[ null, '' ]
```
## WTF!?
Notice he's printing out results instead of returning them. So how come there are no ERROR messages?
```javascript
function test(input, expected) {
  var out = splitDomain(input);
  if (deepEqual(out, expected)) {
    console.log("   OK: " + input + " => " + out);
  } else {
    //console.log("ERROR: " + input + " => " + out + " (expected: " + expected + ")");
  }
}
```
Yup, he disabled my "annoying" error output.
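For anyone following along: the harness leans on a `deepEqual` helper that the post doesn't show. Below is a hypothetical stand-in, assuming it only needs to compare primitives and (nested) arrays. With something like this, a function that logs instead of returning hands back `undefined`, so every single comparison fails, which is why every test would have printed ERROR before the output was commented out.

```javascript
// Hypothetical stand-in for the harness's deepEqual; the real one is not shown.
function deepEqual(a, b) {
  // Identical primitives (including null === null) are equal.
  if (a === b) return true;
  // Anything else must be a pair of arrays to have a chance of matching.
  if (!Array.isArray(a) || !Array.isArray(b)) return false;
  if (a.length !== b.length) return false;
  // Compare element by element, recursing into nested arrays.
  for (var i = 0; i < a.length; i++) {
    if (!deepEqual(a[i], b[i])) return false;
  }
  return true;
}
```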
And WTF is this shit:
if(domain.localeCompare("com")==0)
I didn't even try to figure out what he's doing. I just reenabled my test function and added another test:

```javascript
test("a.b.c.d.e.f.g.mysite.net", ["a.b.c.d.e.f.g", "mysite.net"]);
```
2 failures. Back to the drawing board.
-
-
A day later he's 4 chapters in
Was he before or after the definition of the `split` method?
-
Why first edition and not second?
Mistake in the post. Google must have remembered the link from when I was learning js. He's given the proper 2nd edition.
Was he before or after the definition of the split method?
Exactly.
I explained the idea of programming against an interface (question comes in, answer comes out) and gave him the link to the MDN page on the split method.
20 minutes later, he had this:
```javascript
function splitDomain(domain){
  var domain_array = domain.split(".");
  var sub_domain = domain_array.slice(domain_array.length-2,domain_array.length).join(".");
  var top_domain = domain_array.slice(0,domain_array.length-2).join(".");
  if(top_domain==="")
    top_domain = null;
  var split = [];
  split.push(top_domain,sub_domain);
  //console.log(split);
  return split;
}
```
So... progress!
-
We've got this in our codebase, with some twists:
"a.b.mysite.com" => [ "a.b", "mysite", "com" ]
"a.b.mysite.co.uk" => [ "a.b", "mysite", "co.uk" ]
Not spoiling ahead with how we tackled that.
-
We've got this in our codebase, with some twists:
"a.b.mysite.com" => [ "a.b", "mysite", "com" ]
"a.b.mysite.co.uk" => [ "a.b", "mysite", "co.uk" ]
Not spoiling ahead with how we tackled that.
Create a giant hash of possible top domains, how else.
-
-
Got it in one. Couldn't think of anything better either. Though the hash only contains the ones consisting of two parts (874 of them, at the time).
I need to kill that code; it's unused.
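Roughly, the two-part-suffix hash idea could look like the sketch below. This is not the code from that codebase: the function name `splitDomainParts`, the three sample entries, and the choice of `null` for missing parts are all assumptions made for illustration, and a real table would hold hundreds of suffixes.

```javascript
// Assumed sample; a real table would list every known two-part suffix.
var TWO_PART_SUFFIXES = { "co.uk": true, "org.uk": true, "me.uk": true };

// Hypothetical helper: returns [subdomain, name, tld].
function splitDomainParts(domain) {
  var labels = domain.split(".");
  // Try the last two labels as a suffix first; otherwise use the last label.
  var lastTwo = labels.slice(-2).join(".");
  var isTwoPart = labels.length >= 2 &&
    Object.prototype.hasOwnProperty.call(TWO_PART_SUFFIXES, lastTwo);
  var suffixLength = isTwoPart ? 2 : 1;

  var tld = labels.slice(-suffixLength).join(".");
  var rest = labels.slice(0, -suffixLength);
  // The label right before the suffix is the registered name.
  var name = rest.length > 0 ? rest.pop() : null;
  // Anything still left over is the subdomain.
  var sub = rest.length > 0 ? rest.join(".") : null;
  return [sub, name, tld];
}
```

Anything whose last two labels aren't in the table falls through to a one-label TLD, so only the two-part cases need listing.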
-
Have (@cartman82 said: "a giant hash list of possible top domains") and from that create (@VinDuv said: "A regular expression with lots of |’s") would have been my guess.
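A sketch of that combination, for the curious: build the alternation from a suffix list instead of typing the |'s by hand. The sample list and the helper names (`escapeRegExp`, `buildSuffixRegExp`, `splitByRegExp`) are made up for this example. The prefix group is lazy on purpose, so the engine tries the shortest prefix first and therefore settles on the longest suffix ("co.uk" rather than "uk").

```javascript
// Assumed sample list; a real one would be far longer.
var SUFFIXES = ["com", "net", "uk", "co.uk", "org.uk"];

// Escape characters that are special inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Builds ^(?:(prefix)\.)??(suffix1|suffix2|...)$ from the list.
function buildSuffixRegExp(suffixes) {
  var alternatives = suffixes.map(escapeRegExp).join("|");
  return new RegExp("^(?:(.*?)\\.)??(" + alternatives + ")$");
}

// Returns [subdomain, name, tld], or null when no listed suffix matches.
function splitByRegExp(domain, pattern) {
  var match = pattern.exec(domain);
  if (!match) return null;
  var rest = match[1] ? match[1].split(".") : [];
  var name = rest.length > 0 ? rest.pop() : null;
  var sub = rest.length > 0 ? rest.join(".") : null;
  return [sub, name, match[2]];
}
```

Usage would be along the lines of `splitByRegExp("a.b.mysite.co.uk", buildSuffixRegExp(SUFFIXES))`. With hundreds of alternatives a plain hash lookup is probably easier to maintain, but the regex does work.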
-
Did anyone consider domains like "(domain name).(generic TLD).(country TLD)", e.g. "example.net.de" (which are quite common outside 'mericuh, and there's virtually no difference between those cases and "subdomain.domain.tld" cases)?
Edit: thank you Dicksource and your WYWITNWYG editor! At least I don't have to escape quotes...
-
Misleading variable names FTW.
-
So, how are you handling the fact that there are now top-level domains under .uk?
-
Here's what I would do:
- Call this function.
-
You'd call a go program from the browser? Good luck with that.
-
"a.b.mysite.co.uk" => [ "a.b", "mysite", "co.uk" ]
That's probably going to break soon. .uk is going to become like .com, i.e. no required second-level name (or they're certainly trying to force more money out of us).
You have example.co.uk? You're going to have to get example.uk before example.org.uk do. Unless example.gov.uk get it before both of you.
-
The hash is a whitelist. Since "company.uk" isn't in the whitelist, output is "www.company.uk" => [ "", "company", "uk" ]
There were top-level domains of that description when that code was written too.
-
And don't forget example.me.uk or the more esoteric things like example.plc.uk
-
Strange, I didn't include in the file which source I used. Must've been Wikipedia or IANA or something. I've got 8 .uk domains:
.co.uk
.ltd.uk
.me.uk
.net.uk
.nic.uk
.org.uk
.plc.uk
.sch.uk
Anything else would just get .uk in the TLD output.
I've also got 56 of them on .us. Seems to include one for each state.
EDIT: A short search returns a list at https://publicsuffix.org/list/effective_tld_names.dat . Don't think that's what I used at the time, but if anyone else is looking...
-
.gov.uk
-
-
prog.go:4: can't find import: "code.google.com/p/go.net/publicsuffix"
[process exited with non-zero status]
Program exited.
Nope.
-
I'm surprised nobody's looked at the code yet.
-
Though the hash only contains the one consisting of two parts (874 of them, at the time).
Wouldn't it have been shorter to attack it the other way, i.e., compare against the non-country-specific TLDs like .com?
-
I'm surprised nobody's looked at the code yet.
Given how popular Go is around here, I can only assume that's because of willful blindness.
-
It's not about country TLDs. There are plenty which don't use generic second-level names. Also country TLDs are easy to identify: They're two letters.
The reason not to list TLDs is exactly what has already been discussed in this thread: For some TLDs, there may be both generic and registerable SLDs.
-
-
It's not about country TLDs. There are plenty which don't use generic second-level names. Also country TLDs are easy to identify: They're two letters.
I was assuming--possibly incorrectly--this was old code. Back in the 90s all the ccTLDs (or rather, all the TLDs I was aware of other than .com, .net, and so on) I knew of were of the form .co.uk as opposed to .uk. Given that constraint, the handful of "American" TLDs is a shorter list, right?
-
I'm from the Netherlands. .nl is the oldest ccTLD. It has only had second-level domains of the form .000.nl, .001.nl, etc. for a short while to facilitate personal domains, though I am unsure whether that was ever put into practice. For all practical purposes it does not have a second level.
-
For all practical purposes it does not have a second level.
Somewhere a long time ago--probably all the way back into the late 80s--I got it into my head that the non-ccTLDs all did, and I never noticed anything different strongly enough to change that impression.
-
A short search returns a list at https://publicsuffix.org/list/effective_tld_names.dat
That's the only correct solution. It's curated so as long as you refetch it from time to time, you won't be too far behind. And yes, it does change.
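As a rough illustration of using that list: the file has one rule per line, with `//` comment lines and blank lines in between. The sketch below parses a tiny inline sample in that format and picks the longest matching rule. It is deliberately incomplete: the list's wildcard (`*.`) and exception (`!`) rules are skipped rather than implemented, fetching the file is left out, and the names `SAMPLE_LIST`, `parseSuffixRules`, and `publicSuffix` are invented for the example.

```javascript
// Assumed sample in the list's line-based format; the real file is far longer.
var SAMPLE_LIST = [
  "// comment lines start with two slashes",
  "com",
  "uk",
  "co.uk",
  "",
  "org.uk"
].join("\n");

// Turns the text of the list into a lookup table of plain rules.
function parseSuffixRules(text) {
  var rules = {};
  text.split("\n").forEach(function (line) {
    var rule = line.trim();
    // Skip blank lines and comments.
    if (rule === "" || rule.indexOf("//") === 0) return;
    // Skip wildcard and exception rules, which this sketch does not handle.
    if (rule.charAt(0) === "*" || rule.charAt(0) === "!") return;
    rules[rule] = true;
  });
  return rules;
}

// Returns the longest listed suffix of the domain.
function publicSuffix(domain, rules) {
  var labels = domain.toLowerCase().split(".");
  // Candidates run from longest to shortest, so the longest rule wins.
  for (var i = 0; i < labels.length; i++) {
    var candidate = labels.slice(i).join(".");
    if (Object.prototype.hasOwnProperty.call(rules, candidate)) return candidate;
  }
  // No rule matched: fall back to the last label.
  return labels[labels.length - 1];
}
```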
(UK domains have been done like .co.uk because the names are a hold-over from the pre-internet Coloured Book system, except for the elements being listed in reverse order. When the UK switched to using the far-superior Internet, the names were retained except for the reversal. I'm not sure that the internet-order domain names are better, but virtually everything else was so swapping the names around was the least WTFy way to do it. Coloured Book networking was horrible.)
-
You'd call a go program from the browser? Good luck with that.
This is @ben_lubar we're talking about. He's in a three-way relationship with go and Dwarf Fortress.
-
He's in a three-way relationship with go and Dwarf Fortress.
Imagine if DF were written in Go? We'd either never hear from him again or else we'd never hear about anything else.
-
@ben_lubar: new project for you! Write a DF mod in Go!
-
@Ben_lubar should totally do that!
-
I probably shouldn't be publicising this yet, but I'm making a multiplayer platformer.
-
And you still have time for TDWTF?! You aren't working hard enough!
-
@ben_lubar: new project for you! Write a DF mod in Go!
No, no, no, he should port DF to Go.
-
No screenshots or description? Lame.
-
No, no, no, he should port DF to Go.
I apologize for not setting the bar high enough. I shall go sit in the corner until my punishment is decided.
-
I shall go sit in the corner until my punishment is decided.
Your punishment is to play DF, or ben's game, which I assume will be just as much fun.
-
It's still mostly in the concept stage.
-
Concept + exe stage?
-
It doesn't cost that much to compile a program these days.
-
It doesn't cost that much to compile a program these days.
In a thread where DF is in-context, wouldn't it be more idiomatic for you to use a punchcard-based compiler?
-
So how many viruses am I downloading right now?
-
Well, I don't know the answer to that, but you are downloading a partially-implemented platformer with no maps.
-
Yeah I tried it, it don't do jack. Gave some noise about how to run it as a server and set a port.
Now to clean all the viruses.
-
Why were you downloading viruses? I didn't even tell you to download my game, let alone viruses.
-
Well, whatever it was, it didn't do jack.