Enter the Monorepo


  • Discourse touched me in a no-no place

    @Bulb said in Enter the Monorepo:

    This sounds like a management fuckup caused by management thinking programmers are interchangeable, as is unfortunately way too common.

    That's OK. Management all the way up to CEO and Chairman level tend to be interchangeable too. It's amazing what slime moulds can do.


  • ♿ (Parody)

    @Bulb said in Enter the Monorepo:

    But it is useful to have some overview of what others are working on so you can coordinate changes to the same functionality before you create conflicts that are too big and lose time resolving them.

    We're pretty much all remote since COVID (and probably already ~75% remote before) so this also just keeps everyone in touch with each other. I'm talking with other devs all the time, but not necessarily with the ops guys, so this is good from a team perspective for us. Not a huge team...there are...uh...less than 15 of us total.

    For years this was just phone but since the company moved from Skype to Teams we've been doing video calls starting this past January.

    Some people go on and on with their statuses, which is of course super annoying but most people are pretty good. It's very common to have some people stay on afterwards to continue some discussion in detail.


  • Discourse touched me in a no-no place

    @boomzilla said in Enter the Monorepo:

    For years this was just phone but since the company moved from Skype to Teams we've been doing video calls starting this past January.

    We've been at least partially remote for quite a few years now (for complicated reasons originally), first with the phone, then Webex, then Skype, then Teams, then finally Zoom. Directly using a phone is miserable unless you have a fancy headset (which nobody could ever seem to authorize the expense for). Webex was pretty good for its time, but dreadfully expensive. Skype was easy to use, but not very good at scaling up. Teams doesn't seem to work great unless everyone's using Windows (which the team never has done). Zoom seems to work well.

    All assuming you have the processing power and bandwidth, of course. Those armed with potatoes and tin cans for their comms infrastructure can go suck it.


  • ♿ (Parody)

    @dkf said in Enter the Monorepo:

    Directly using a phone is miserable unless you have a fancy headset (which nobody could ever seem to authorize the expense for).

    I got a google number and got a headset for my computer...largely because my wife didn't want to hear my calls on speakerphone any more.

    Teams doesn't seem to work great unless everyone's using Windows (which the team never has done)

    We have a few people on Macs and that seems to work just fine. I pretty much always have a windows and a linux teams client running (the linux client inside my guest vm, of course). I mostly use the linux client for typed messages, because I'm normally working in the vm but the windows client for voice / video, largely due to :kneeling_warthog: of sharing the hardware with the vm.


  • Discourse touched me in a no-no place

    @boomzilla said in Enter the Monorepo:

    We have a few people on Macs and that seems to work just fine.

    Hmm, I wonder what was going wrong then. I don't care to investigate, of course.


  • ♿ (Parody)

    @dkf could be that they've improved the Mac offering. As I said, we didn't start using it until the beginning of this year.



  • @Bulb said in Enter the Monorepo:

    […]
    It gets worse, because one of our modules was developed in site A (where I work) by a bunch of site A developers (including a certain @Steve_The_Cynic that posts here), but is now theoretically handled by developers at site B. They know a certain amount about how a specific subset of the module works, but there are lots of other things it does, and the only people who know about those parts work at site A. I get lots of questions from them when they have to venture down into the dusty corners down below that I or this guy or that guy wrote (good code, but complicated because the subjects are complicated).

    This sounds like a management fuckup caused by management thinking programmers are interchangeable, as is unfortunately way too common.

    It's not exactly that, but dangerously close to it. The people making the "which teams work on what" decisions aren't really aware of just how much deeply tangled code is under the surface of that module, and how close to the surface that deeply tangled code really is. (One of them should be more aware than he appears to be, because he's been at the company for five or six years longer than my thirteen years here (how time flies when you're having fun...).)

    It is rather orthogonal to the repository organization.

    It is, but...

    The component is managed as a separate subproject just fine, the management just neglected to keep enough people with domain and code knowledge on the team, so now the people supposed to do the work have to keep asking the people who know how to do it, but are supposed to be doing something else.

    It has always been somewhat a separate subproject. It's a large kernel module with a horribly tangled kernel<=>userland interface (like Topsy, it just sorta growed, but that growth wasn't well managed, either at the beginning or later on).

    For me, it's complicated by my job being a sort of technical conscience as well as (?instead of?) being just a very experienced developer, which means that "help out any and all other teams with their problems" is explicitly part of my current job description.

    But the point about a misguided perception of interchangeability of developers is well-taken, and definitely relevant. There is a copy of the 25th anniversary edition of The Mythical Man-Month on my desk, as a rather pointed but largely overlooked comment on ... stuff.

    Linux has particularly fast stat and readdir, but on Windows the vast majority of the time taken by status is spent listing the files and checking their timestamps. Which obviously affects all version control systems.

    Are FindFirstFile and FindNextFile really that slow? (Noteworthy: they combine readdir and stat into single calls.)

    They are not that bad, though they are still quite a bit slower than Linux getdents. But often the problem is compounded by the version control system using some portability layer that ends up calling GetFileAttributes separately and that is really slow.

    Yeah, that's going to suck. Sounds like the original code wasn't factored correctly before the portability shim was put under it, or that the shim is too shallow.

    For those in the audience who don't know:

    • FindFirstFile and FindNextFile are approximately the equivalent of calling opendir/readdir on a UNIX type system.
    • Except that in addition to the name of the directory entry, they also return similar metadata to what stat would return, which obviates the need to make a separate call to GetFileAttributes.
    • So think of them as readdir that also calls stat for "free" (and if not free, then much more cheaply than an explicit call).
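    A rough cross-platform illustration of that difference in Python: os.scandir follows the combined pattern (on Windows, CPython implements it on top of FindFirstFile/FindNextFile), while listdir plus a per-file stat is the pattern that hurts. The file names and scratch directory here are made up for the demo.

```python
import os
import tempfile

# Scratch directory with a couple of files to list.
with tempfile.TemporaryDirectory() as d:
    for name in ("a.txt", "b.txt"):
        open(os.path.join(d, name), "w").close()

    # Slow pattern: list names, then issue a separate metadata
    # lookup (stat / GetFileAttributes) for every single entry.
    sizes_slow = {n: os.stat(os.path.join(d, n)).st_size
                  for n in os.listdir(d)}

    # Combined pattern: scandir yields entries that carry metadata
    # along with the name (free on Windows, partly cached elsewhere),
    # like FindFirstFile/FindNextFile.
    sizes_fast = {e.name: e.stat().st_size for e in os.scandir(d)}

assert sizes_slow == sizes_fast == {"a.txt": 0, "b.txt": 0}
```

    Same answer either way; the difference is how many metadata round-trips it took to get there.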

  • Java Dev

    @Steve_The_Cynic said in Enter the Monorepo:

    misguided perception of interchangeability of developers

    Don't I know it. The boss² refuses to accept that on-boarding a new developer in a team might take more than 2 weeks, even as we've been working on getting on-boarded for 2 months now and I still don't even have the solution running in my dev environment.



  • I’m wondering if source control organisation is a natural result of Conway’s Law or if Conway’s Law is the cause of it, c.f. repos organised by team structure.



  • @dkf said in Enter the Monorepo:

    @Parody said in Enter the Monorepo:

    We didn't do a ton of code reviews. The project leads thought we should do more but time was always at a premium.

    The project leads are right... and it is up to them to provide the resource to get you out of fire-fighting mode.

    Sadly, their higher-ups didn't give them many resources to work with. On the plus side, we hardly ever had crunch time. :)



  • @Parody said in Enter the Monorepo:

    Use a centralized source control system instead?

    It wouldn't actually help. The git backing store is not the bottle-neck, the working directory is.

    It sounded to me like the problem was doing scans of the directory. This is the directory where you've already cut out those extra files (by limiting your view), you've been eliminating some scans by telling the server what you want to change (checking out/locking files), and where some operations may not even happen on your machine, but up on the server.

    Don't get me wrong, there's plenty of limitations involved starting with needing that connection to the server. It just really sounds to me like they solved their problem by pushing back towards doing things in the older style.



    @Parody There is no server here. Even subversion does status fully locally. And since status is also looking for new files, there is no avoiding scanning the directory. Plus, the whole point is that reading a directory is probably faster than looking up a bunch of specific files in it anyway, because most of the time the system has to read the list of files anyway, and then it just does a linear read instead of doing a lookup again and again.

    Hm, reminds me, there is also some quirk on Windows with saying you don't want the short names (there is a flag for the FindFirstFileEx for that purpose since Windows 7). The short names are implemented rather inefficiently, so it speeds things up quite a lot when you have large directories.



  • @Bulb said in Enter the Monorepo:

    Hm, reminds me, there is also some quirk on Windows with saying you don't want the short names (there is a flag for the FindFirstFileEx for that purpose since Windows 7). The short names are implemented rather inefficiently, so it speeds things up quite a lot when you have large directories.

    Curiously, that is the one thing that FAT filesystems do better than NTFS, since the short name is the name of the file, and the directory entries that contain the long name immediately follow the entry that contains the short name.


  • BINNED

    @Bulb said in Enter the Monorepo:

    @Parody There is no server here. Even subversion does status fully locally. And since status is also looking for new files, there is no avoiding scanning the directory. Plus, the whole point is that reading a directory is probably faster than looking up a bunch of specific files in it anyway, because most of the time the system has to read the list of files anyway, and then it just does a linear read instead of doing a lookup again and again.

    Hm, reminds me, there is also some quirk on Windows with saying you don't want the short names (there is a flag for the FindFirstFileEx for that purpose since Windows 7). The short names are implemented rather inefficiently, so it speeds things up quite a lot when you have large directories.

    It's 2022. As far as I know, Windows 3.x / DOS have both been EOL for a while and 16 bit apps aren't supported anymore.
    WhyTF are short names still a thing?!



  • @topspin said in Enter the Monorepo:

    WhyTF are short names still a thing?!

    Say, you want to launch a child process on Windows and pass it a path to a file. CreateProcess only accepts one command line that the child is supposed to parse (and maybe split into char ** argv, maybe do something completely different). The rules are somewhat different depending on the runtime that the child is using. For years, your software has been relying on the short name containing no spaces instead of quoting the path. Enter Windows 10, which sometimes comes installed on volumes with short names disabled. Boom! Now it doesn't work.



  • Backwards compatibility is a very dirty double edged sword.



  • @aitap It might also be using short names to evade the 260 or so character limit on paths. I've done that a few times in the past (in scripts, using cygpath -d).



  • @Bulb said in Enter the Monorepo:

    @aitap It might also be using short names to evade the 260 or so character limit on paths. I've done that a few times in the past (in scripts, using cygpath -d).

    That limit isn't real, but requires a bit of programming to overcome, and in certain degenerate cases, the short name can be longer than the long name anyway.(1)

    Basically, you have to use the wide-char version of the API in question. I'll take CreateFile as an example, but the same applies to the others.

    • We make explicit use of the wide-char version of the function, CreateFileW.
    • We pass it a completely-parsed filename, ready to be used. No "." or ".." directory names, absolute path, all slashes are backslashes with no forward slashes, no double slashes except at the beginning of a UNC path, etc.
    • We prefix this name with \\?\ (the documented extended-length prefix). If the complete path is a UNC path, its leading \\ is replaced so it becomes \\?\UNC\server\share\....

    In this case, the limit is 32767 characters.

    (1) Taking the example, on FAT (12, 16 or 32, doesn't matter which), of a.html, we find that the long name is six characters long, but the short name is no shorter than A~1.HTM, which is already seven characters, and could be longer if it is created after the ~1, ~2, ..., ~9 versions. .a is even worse, just two characters that generate a short name that is at least as long as A~1.
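    For the record, the extended-length form Microsoft documents is the \\?\ prefix, with \\?\UNC\ for network paths. A toy Python sketch of just the string-prefixing step; the helper name is invented, and it assumes the input is already fully canonical as described above:

```python
def to_extended_length(path: str) -> str:
    """Prefix an already-canonical absolute Windows path so the
    wide-char APIs accept up to 32767 characters (illustrative)."""
    if path.startswith("\\\\?\\"):
        return path                          # already prefixed
    if path.startswith("\\\\"):
        # UNC: \\server\share\... -> \\?\UNC\server\share\...
        return "\\\\?\\UNC" + path[1:]
    return "\\\\?\\" + path                  # drive-letter path

assert to_extended_length("C:\\x") == "\\\\?\\C:\\x"
assert (to_extended_length("\\\\srv\\share\\f.txt")
        == "\\\\?\\UNC\\srv\\share\\f.txt")
```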



  • @Arantor said in Enter the Monorepo:

    Backwards compatibility is a very dirty double edged sword.

    And both of those edges extend all the way to the other end of the hilt. Be very careful how you hold that sword.



  • @topspin said in Enter the Monorepo:

    It's 2022. As far as I know, Windows 3.x / DOS have both been EOL for a while and 16 bit apps aren't supported anymore.

    16-bit Windows apps have only ceased being supported with the arrival of Windows 11. Windows 10 had 32-bit builds where the 64-bit-inspired "no 16-bit" thing didn't apply.



  • @Steve_The_Cynic said in Enter the Monorepo:

    That limit isn't real, but requires a bit of programming to overcome

    The problem happens when it would need someone else to do that programming, because you are just passing a path to some other tool and that tool doesn't do that.

    @Steve_The_Cynic said in Enter the Monorepo:

    We pass it a completely-parsed filename, ready to be used. No "." or ".." directory names, absolute path, all slashes are backslashes with no forward slashes, no double slashes except at the beginning of a UNC path, etc.

    Which means you have to do a helluva lot of things that the filesystem driver is supposed to be doing for you.

    Can you even canonicalize the path as text without reading each element? I know you can't on Unix, because if you have a symlink (and those are a thing on NTFS) and then .., it points to the real parent of the target, but I am not sure whether Windows does that too or just takes the drive current path and normalizes it as a string. If the latter, you can replicate it, but it's still :wtf:.

    Also I can't really imagine which application relies on CreateFile rejecting a longer argument so they couldn't just raise the limit and call it a fucking fortnight.
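    On the Unix side at least, the textual-vs-resolved difference raised above is easy to show with Python's path helpers (scratch-directory sketch):

```python
import os
import tempfile

# normpath() collapses ".." purely as text; realpath() resolves
# symlinks first, so with a symlink in the path they disagree.
with tempfile.TemporaryDirectory() as d:
    root = os.path.realpath(d)       # in case the temp dir is symlinked
    os.makedirs(os.path.join(root, "real", "sub"))
    os.symlink(os.path.join(root, "real", "sub"),
               os.path.join(root, "link"))

    p = os.path.join(root, "link", "..")
    textual = os.path.normpath(p)    # drops "link/.." as a string
    resolved = os.path.realpath(p)   # follows the link, then applies ".."

assert textual == root                          # text: back at the root
assert resolved == os.path.join(root, "real")   # reality: parent of sub
```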



  • @aitap said in Enter the Monorepo:

    @topspin said in Enter the Monorepo:

    WhyTF are short names still a thing?!

    Say, you want to launch a child process on Windows and pass it a path to a file. CreateProcess only accepts one command line that the child is supposed to parse (and maybe split into char ** argv, maybe do something completely different). The rules are somewhat different depending on the runtime that the child is using. For years, your software has been relying on the short name containing no spaces instead of quoting the path.

    … like (GNU) Make. Yeah, that's the other thing I used cygpath -d for. Actually, IIRC with make I used it both because I wouldn't have problems with spaces (which GNU Make can't be made to handle) and because the assembling of paths resulted in exceeding the command-line length limits as the -I flags accumulated.



  • @Bulb Afaik they did call it a fortnight, although it requires you to set a value in the registry (it is still not the default :grumpy-cat: ). See https://docs.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=cmd#enable-long-paths-in-windows-10-version-1607-and-later
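    For reference, the opt-in that page describes boils down to one registry value (applications additionally have to declare themselves longPathAware in their manifest). A .reg sketch of the documented key:

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem]
"LongPathsEnabled"=dword:00000001
```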



  • @Bulb said in Enter the Monorepo:

    @Steve_The_Cynic said in Enter the Monorepo:

    That limit isn't real, but requires a bit of programming to overcome

    The problem happens when it would need someone else to do that programming, because you are just passing a path to some other tool and that tool doesn't do that.

    Indeed. I didn't say who had to do the programming... 😲

    @Steve_The_Cynic said in Enter the Monorepo:

    We pass it a completely-parsed filename, ready to be used. No "." or ".." directory names, absolute path, all slashes are backslashes with no forward slashes, no double slashes except at the beginning of a UNC path, etc.

    Which means you have to do a helluva lot of things that the filesystem driver is supposed to be doing for you.

    Not the filesystem driver, but the system file-path parser, since it's (very) approximately independent of the filesystem.

    Can you even canonicalize the path as text without reading each element? I know you can't on Unix, because if you have a symlink (and those are a thing on NTFS) and then .., it points to the real parent of the target, but I am not sure whether Windows does that too or just takes the drive current path and normalizes it as a string. If the latter, you can replicate it, but it's still :wtf:.

    It's ... fraught, and in general I'd avoid doing it. Where it really gets necessary is when someone shares C:\shortpath and uses shortpath as the share name, on a machine whose network name is already long. In such a case, you can get a full path that's valid (by the "260" thing) for local applications on that machine, but can't be accessed via the "260 thing" over the network.

    Also I can't really imagine which application relies on CreateFile rejecting a longer argument so they couldn't just raise the limit and call it a fucking fortnight.

    The problem isn't, as such, CreateFile (where the path name is input), but rather things like GetFullPathName or GetCurrentDirectory, where applications expect the limit to be that "260 thing" and the path name is output.



  • @Steve_The_Cynic said in Enter the Monorepo:

    The problem isn't, as such, CreateFile (where the path name is input), but rather things like GetFullPathName or GetCurrentDirectory, where applications expect the limit to be that "260 thing" and the path name is output.

    … except:

    1. Those functions are used much less often. Many programs just open whatever path they are given and that's it. So some applications would have just started to work if they simply relaxed the limit.
    2. Those functions do take the buffer length as argument, so if they relaxed the limit, some applications would have just started to work.


  • @Bulb said in Enter the Monorepo:

    @Steve_The_Cynic said in Enter the Monorepo:

    The problem isn't, as such, CreateFile (where the path name is input), but rather things like GetFullPathName or GetCurrentDirectory, where applications expect the limit to be that "260 thing" and the path name is output.

    … except:

    1. Those functions are used much less often. Many programs just open whatever path they are given and that's it. So some applications would have just started to work if they simply relaxed the limit.
    2. Those functions do take the buffer length as argument, so if they relaxed the limit, some applications would have just started to work.

    Agreed, except that the current versions of the documentation don't say what happens if you pass 32767 as your buffer size (with a buffer that's actually that big, of course) to GetCurrentDirectoryA (say) when the current directory requires more than MAX_PATH (260) characters. It's doubly a problem for GetCurrentDirectoryA if the actual directory path is longer than MAX_PATH and the function works as one would hope, since you can't pass that string back to SetCurrentDirectoryA to restore the current directory after manipulating it in some way. Even for GetCurrentDirectoryW and SetCurrentDirectoryW, it won't work if you just blindly do it(1) unless you also do what's described in https://docs.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=cmd .

    (1) That is, without prefixing the path with \\?\.



    @Steve_The_Cynic The thing they ended up doing still does not seem to have solved any issue better; it just created a lot of extra work for developers.



  • @Bulb said in Enter the Monorepo:

    @Steve_The_Cynic The thing they ended up doing still does not seem to have solved any issue better; it just created a lot of extra work for developers.

    Welcome to the world of Microsoft.



  • Statistically, I am a big fan of tiny focused repositories and loosely coupled relationships [think along the line of micro-services, but not specifically microservices]. I have worked on mono-repos and with very few exceptions found the problems to significantly outweigh any perceived benefits.



  • @topspin said in Enter the Monorepo:

    WhyTF are short names still a thing?!

    Why are Short people still a thing??? Thought Randy took care of that decades ago....


  • Discourse touched me in a no-no place

    @Bulb said in Enter the Monorepo:

    spaces (which GNU Make can't be made to handle)

    That's mostly not a problem if you're very careful. There are a few tools where it is totally impossible. (I remember the resource compiler was one of those, because it internally didn't handle spaces well at all.)

    the assembling of paths resulted in exceeding the command-line length limits as the -I flags accumulated.

    That's completely fatal. The fix used in much of the Java toolchain is to allow arguments to be passed in a file, with one argument per line (because newlines are a lot less common in filenames and it's pretty practical). The file itself is passed as @thefile.txt. It's not a core feature of the JRE, just something that the build-time tools have as a convention. Apparently gcc can do the same thing (TIL).
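    A simplified sketch of that @file convention in Python (real tools layer quoting/escaping rules on top of this; the helper name is invented):

```python
import os
import tempfile

def expand_argfiles(argv):
    """Splice '@file' arguments into the argument list: each names a
    file holding one argument per line (simplified; no quoting)."""
    out = []
    for arg in argv:
        if arg.startswith("@"):
            with open(arg[1:]) as f:
                out.extend(line.rstrip("\n") for line in f if line.strip())
        else:
            out.append(arg)
    return out

# Demo: an argument file with a space-containing include path.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("-I/some dir/include\n-DFOO=1\n")
    argfile = f.name
try:
    assert expand_argfiles(["cc", "@" + argfile, "x.c"]) == \
        ["cc", "-I/some dir/include", "-DFOO=1", "x.c"]
finally:
    os.remove(argfile)
```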



  • @dkf said in Enter the Monorepo:

    @Bulb said in Enter the Monorepo:

    spaces (which GNU Make can't be made to handle)

    That's mostly not a problem if you're very careful. There are a few tools where it is totally impossible. (I remember the resource compiler was one of those, because it internally didn't handle spaces well at all.)

    There are few tools where it is totally impossible, but the one I mentioned is among them.

    It's no longer a problem now that Android native build can use CMake or Gradle, but it was very much a problem when all they had was that GNU Make-based monstrosity (back then I ended up cobbling together some CMake support for our project, because all the other platforms did work with that).

    the assembling of paths resulted in exceeding the command-line length limits as the -I flags accumulated.

    That's completely fatal. The fix used in much of the Java toolchain is to allow arguments to be passed in a file, with one argument per line (because newlines are a lot less common in filenames and it's pretty practical). The file itself is passed as @thefile.txt. It's not a core feature of the JRE, just something that the build-time tools have as a convention. Apparently gcc can do the same thing (TIL).

    In my case the Android make-based monstrosity was generating insanely long paths by going into the toolchain and back, and in and back again. Something along the lines of -Isomethingalreadylong/toolchain/arm-linux-android-eabi/usr/lib/gcc/arm-linux-android-eabi/../../../../../platform/… So I worked around that by monkey-patching some function to pass the paths through cygpath -da to generate canonical absolute short paths instead, and that managed to get the thing to work.

    When I made it work with cmake instead, it used absolute paths on its own. They've probably hit the same problem earlier and decided absolute paths are less pain in the a*.


  • Discourse touched me in a no-no place

    @Bulb said in Enter the Monorepo:

    When I made it work with cmake instead, it used absolute paths on its own

    Long experience with writing programs in various languages suggests that relative paths are OK as long as they're things directly passed in by a user on the command line (and you never call chdir()). Otherwise it's just so much easier to make them absolute as soon as possible, if you can.

    That general rule also applies to build tools. It would all be easy everywhere if it wasn't for:

    1. Path length limits.
    2. Dumb quoting.
    3. Command line length limits.

    You know your tooling is doing something very wrong if you're hitting the command line length limit on Linux (except if you're feeding in masses of files, when you should use xargs).
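    The batching that xargs (and find -exec … +) does is simple enough to sketch. This toy Python chunker (invented name, arbitrary limit) packs as many arguments per invocation as fit under a byte budget:

```python
def batch_args(paths, limit=4096):
    """Pack arguments into per-invocation batches under a byte
    limit, the way xargs / find -exec {} + batch their commands."""
    batches, cur, size = [], [], 0
    for p in paths:
        cost = len(p) + 1          # argument plus its NUL terminator
        if cur and size + cost > limit:
            batches.append(cur)    # batch is full, start another
            cur, size = [], 0
        cur.append(p)
        size += cost
    if cur:
        batches.append(cur)
    return batches

# Five 10-byte names with room for two per command line:
assert batch_args(["a" * 10] * 5, limit=25) == \
    [["a" * 10] * 2, ["a" * 10] * 2, ["a" * 10]]
```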



  • @dkf said in Enter the Monorepo:

    easier to make them absolute as soon as possible

    Usually. Unless something throws a pitchfork in the working by some Clever™ use of symlinks.

    … there is also the option to keep the directories various paths are supposed to be relative to open and use openat, but I've never seen anybody actually do it.
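    For what it's worth, the openat pattern is directly reachable from Python via the dir_fd arguments (POSIX-only scratch-directory sketch):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "note.txt"), "w") as f:
        f.write("hello")

    # Hold the directory open as a file descriptor...
    dfd = os.open(d, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # ...then resolve "note.txt" relative to that fd (openat
        # under the hood), not relative to the process CWD.
        fd = os.open("note.txt", os.O_RDONLY, dir_fd=dfd)
        try:
            data = os.read(fd, 100)
        finally:
            os.close(fd)
    finally:
        os.close(dfd)

assert data == b"hello"
```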


  • Discourse touched me in a no-no place

    @Bulb said in Enter the Monorepo:

    Unless something throws a pitchfork in the working by some Clever™ use of symlinks.

    Resolving files to absolute should usually handle that (and that's why it is a fairly expensive operation, potentially). The exception comes if someone is Very Crafty™ and changes symlinks under the program's feet, but that sort of thing sabotages a great many things anyway.

    use openat, but I've never seen anybody actually do it.

    I think the main interest anyone would have there would be due to it being able to simulate a per-thread CWD. It's just a shame that that won't work for every system library call that takes a filename (e.g., to the dynamic library loader).



  • @dkf said in Enter the Monorepo:

    @Bulb said in Enter the Monorepo:

    Unless something throws a pitchfork in the working by some Clever™ use of symlinks.

    Resolving files to absolute should usually handle that (and that's why it is a fairly expensive operation, potentially). The exception comes if someone is Very Crafty™ and changes symlinks under the program's feet, but that sort of thing sabotages a great many things anyway.

    The problem I have in mind is if someone derives one path from another. Like getting the path to the compiler binary and then deriving the path to the libraries by removing two components (bin/gcc) and replacing them with lib. If you have a …platform/corund/toolchain/bin linking to /opt/cross-monkey/arm-gcc-7.3/bin and …platform/corund/toolchain/lib linking to /var/lib/cross/corund/sysroot, things work, but only as long as you don't resolve the symlinks. Ptxdist, which we used in previous project, has a couple of layers of such links (and some wrappers around the tools to do some extra option fixups).


  • Java Dev

    @dkf said in Enter the Monorepo:

    Otherwise it's just so much easier to make them absolute as soon as possible, if you can.

    I had to go the other way, and make absolutely sure all references were relative. As I recall, this was because the exact paths you specify end up in the binary's debug information.

    @dkf said in Enter the Monorepo:

    except if you're feeding in masses of files, when you should use xargs

    I generally use find ... -exec somecommand {} +. This works similar to find ... -print0 | xargs -0 somecommand but is less fiddly to type. Naturally, there are also cases where the filenames do not originate in find.

    @Bulb said in Enter the Monorepo:

    … there is also the option to keep the directories various paths are supposed to be relative to open and use openat, but I've never seen anybody actually do it.

    I did that when our source code scanner complained about path name buffer overflow. It was less work than dynamically allocating the buffer every time I needed to format a file path.


  • Considered Harmful

    @Arantor said in Enter the Monorepo:

    I’m wondering if source control organisation is a natural result of Conway’s Law or if Conway’s Law is the cause of it, c.f. repos organised by team structure.

    Seems to be the direct corollary for the sccs domain


  • Considered Harmful

    @TheCPUWizard said in Enter the Monorepo:

    @topspin said in Enter the Monorepo:

    WhyTF are short names still a thing?!

    Why are Short people still a thing??? Thought Randy took care of that decades ago....

    Flag-hats came up as a solution, so we started working on that instead. Should be rolling out soon. At least it'll be harder to run right into them.



  • @Bulb For sure. It's an unholy mess no matter how you look at it.



  • @PleegWat said in Enter the Monorepo:

    @dkf said in Enter the Monorepo:

    Otherwise it's just so much easier to make them absolute as soon as possible, if you can.

    I had to go the other way, and make absolutely sure all references were relative. As I recall, this was because the exact paths you specify end up in the binary's debug information.

    I believe there is some well-hidden option to remove a prefix from the recorded paths. At least bitbake somehow managed to get rid of the insane prefixes it generated with its hardlink sysroots.



  • @PleegWat said in Enter the Monorepo:

    I generally use find ... -exec somecommand {} +. This works similar to find ... -print0 | xargs -0 somecommand but is less fiddly to type. Naturally, there are also cases where the filenames do not originate in find.

    It's actually substantially slower than the find | xargs version, since find -exec invokes the command once for each filesystem object(1) that matches the conditions in the ..., while the xargs version bundles them up and invokes it once per group of filesystem objects. Well, unless you throw in -1 as well.

    (1) Beware of fifos, the detritus from open or closed UNIX sockets, device special files, directories, symbolic links and so on, especially if the command only works on true files.


  • Discourse touched me in a no-no place

    @Steve_The_Cynic said in Enter the Monorepo:

    It's actually substantially slower than the find | xargs version, since find -exec invokes the command once for each filesystem object(1) that matches the conditions in the ...

    That's what the + is for.



  • This post is deleted!


  • @dkf OK. I don't have the man page in front of me, and the last time I looked specifically at the syntax of the -exec action of find was probably ... (thinks) ... um, ... a long time ago. Like 1996 or so.


  • Java Dev

    @dkf said in Enter the Monorepo:

    @Steve_The_Cynic said in Enter the Monorepo:

    It's actually substantially slower than the find | xargs version, since find -exec invokes the command once for each filesystem object(1) that matches the conditions in the ...

    That's what the + is for.

    Exactly. It does not support everything xargs does, but it handles the most common case.

    Also good to know is find -delete, if that is the action you want to perform.


  • Discourse touched me in a no-no place

    @Steve_The_Cynic I use xargs because that's what I learned, but I can remember the +; it's part of the set of things I can read but don't remember when it comes time to use them.



  • @dkf The big advantage of find -exec is that it knows where the filenames begin and end. The find -print0 and xargs -0 are GNU extensions, which means you can't use them when your script is supposed to run on MacOS (or any other BSD derivative), some embedded systems using busybox or even that house of cards of a system the Chinese department cobbled together from AOSP the other day that uses “toybox”. And then you are relying on the filenames not containing newlines, and have to be careful to at least handle spaces correctly, because the filenames definitely do contain those. At that point, find -exec becomes easier.

    But in general I usually give up and don't try to support newlines in filenames. Spaces and other weird characters are doable (but note that xargs defaults to splitting on any whitespace), but newlines are too much of a pain. Too many places don't have a suitable option for nul-separation.
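    The core of the problem fits in a few lines of Python: a newline-delimited list of names is ambiguous, while a NUL-delimited one (find -print0 / xargs -0) round-trips exactly.

```python
# A filename may legally contain a newline but never a NUL byte.
names = ["plain.txt", "with space.txt", "with\nnewline.txt"]

newline_list = "\n".join(names)
nul_list = "\0".join(names)

# Splitting on newline shreds the third name into two bogus entries...
assert newline_list.split("\n") == \
    ["plain.txt", "with space.txt", "with", "newline.txt"]

# ...while splitting on NUL recovers the original list exactly.
assert nul_list.split("\0") == names
```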


  • Discourse touched me in a no-no place

    @Bulb said in Enter the Monorepo:

    newlines are too much of a pain

    They're as nothing to terminal control sequences. Because you want your file listings to scroll messages in your window title bar. (I've seen that done and it was an impressive/ghastly hack.)



  • @dkf Terminal control sequences are generally not a problem for scripts though, because they don't get interpreted in the middle of a pipeline. But many things generating some input can only use newline as a separator.

