What the Daily WTF?

Check if Win32 is internally duplicating the handle when it launches the child process. If yes, you should be able to CloseHandle() the write handle just after the CreateProcess() in the parent process (the child process will still have a reference to the handle). Maybe that will cause ReadFile() to return the appropriate status once the child process exits.

@anotherusername said in Reading stdout with WinAPI:

Have you tried putting CloseHandle(stdoutWriteHandle) right where you close the thread handle (don't know if it matters which you do first)?

Ok, so testing that now it does seem to read all of the input but then the last ReadFile() errors out with ERROR_BROKEN_PIPE. But I guess that may be fine? I mean it makes sense: after being closed on the parent's end the only thing keeping a handle to that pipe is the child process, so once it ends, the pipe will be closed. It just doesn't feel right.

@deadfast you can use PeekNamedPipe to check if input is available. Yes, it works with anonymous pipes too. Your loop should first check if the process is alive, then read input if available, and then, if the process is dead, wrap up and return. It's important to check process state before input state, but check input before deciding to return.

Hm, I actually tried that before and it does work:

for (;;)
{
  // Check if there is anything in the pipe.
  DWORD dataSize;
  success = PeekNamedPipe(stdoutReadHandle, nullptr, 0, nullptr, &dataSize, nullptr);
  if (!success)
  {
    return false;
  }

  if (dataSize == 0)
  {
    // If there is no data in the pipe check if the process has ended.
    GetExitCodeProcess(piProcInfo.hProcess, &exitCode);

    if (exitCode == STILL_ACTIVE)
    {
      // There should still be more output so try again.
      continue;
    }
    else
    {
      // We're done.
      break;
    }
  }

  // Read the data out of the pipe.
  CHAR buffer[4096] = { 0 };
  success = ReadFile(stdoutReadHandle, buffer, sizeof(buffer) - 1, &dataSize, nullptr);
  if (!success)
  {
    return false;
  }
  out << buffer;
}

It just feels dodgy as well though.

Gąska

@deadfast it's not dodgy, just low-level. Non-blocking IO always looks awful.

Also, you have potential race condition when the child process writes to pipe just before it exits - if it happens after you checked pipe status but before you checked process status, you lose the final portion of data.
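One way to picture the fix is to sample the process state before peeking at the pipe, so that a write landing just before exit is still seen on the next pass. The sketch below is only an illustration under stated assumptions: `ChildIo`, `drainRaceFree`, and `simulateLateWrite` are hypothetical names, and the three callables stand in for GetExitCodeProcess, PeekNamedPipe, and ReadFile so the ordering can be exercised without a real child process.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-ins for the WinAPI calls discussed in the thread.
struct ChildIo {
    std::function<bool()> isAlive;                 // would wrap GetExitCodeProcess
    std::function<std::size_t()> available;        // would wrap PeekNamedPipe
    std::function<std::string(std::size_t)> read;  // would wrap ReadFile
};

// Samples the process state BEFORE peeking at the pipe. If the child was
// already dead at that point and the pipe is empty afterwards, nothing more
// can arrive (assuming the child is the only writer), so stopping is safe.
std::string drainRaceFree(const ChildIo& io) {
    assert(io.isAlive && io.available && io.read);
    std::string out;
    for (;;) {
        const bool alive = io.isAlive();             // 1. process state first
        const std::size_t pending = io.available();  // 2. then the pipe
        if (pending > 0) {
            out += io.read(pending);
            continue;  // re-check both before deciding to stop
        }
        if (!alive) {
            break;  // dead before the peek, and the pipe is empty
        }
        // Alive with an empty pipe: this spins, like the loop in the thread.
    }
    return out;
}

// Simulates a child that writes "tail" and exits right after the first peek,
// which is the window where the pipe-first ordering would lose data.
std::string simulateLateWrite() {
    std::string pipe;
    int aliveCalls = 0;
    ChildIo io;
    io.isAlive = [&]() -> bool {
        if (++aliveCalls == 2) pipe += "tail";  // late write, then exit
        return aliveCalls < 2;
    };
    io.available = [&]() -> std::size_t { return pipe.size(); };
    io.read = [&](std::size_t n) -> std::string {
        std::string chunk = pipe.substr(0, n);
        pipe.erase(0, n);
        return chunk;
    };
    return drainRaceFree(io);
}
```

In the simulated run the late write is still picked up, because the loop only stops when the process was seen dead before an empty peek.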

Deadfast

Also, you have potential race condition when the child process writes to pipe just before it exits - if it happens after you checked pipe status but before you checked process status, you lose the final portion of data.

Good point. Any ideas on how to address that?

Gąska

Ok, so testing that now it does seem to read all of the input but then the last ReadFile() errors out with ERROR_BROKEN_PIPE. But I guess that may be fine?

It is. Broken pipe literally means that it was closed on the other end and there is no more data. This is how you know the other process wrote all its output and ended.

This might be a better solution than mine. It's just that the documentation is fairly bad at explaining what handle inheritance means exactly - apparently it means the child process gets a duplicate handle that can be closed separately, which is a good thing because this is exactly what we'd like. Or it might be something that only looks like a duplicate handle, and closing it is actually a bad idea.

Gąska

@gąska said in Reading stdout with WinAPI:

Also, you have potential race condition when the child process writes to pipe just before it exits - if it happens after you checked pipe status but before you checked process status, you lose the final portion of data.

Good point. Any ideas on how to address that?

Do exactly as I told you to do in my first post.

cvi

But I guess that may be fine?

I think so. Not sure if you can get anything better.

It just feels dodgy as well though.

It is a bit dodgy IMO. Besides the race that @Gąska points out, you essentially end up in a tight polling loop when the pipe is empty and the process is still alive. Even if you fix the race, I would recommend avoiding this solution due to the polling loop.

Deadfast

Do exactly as I told you to do in my first post.

Heh, I thought I did... Attempt #2:

DWORD exitCode, dataSize;
do
{
  // Check if the process is alive.
  GetExitCodeProcess(piProcInfo.hProcess, &exitCode);

  // Check if there is anything in the pipe.
  success = PeekNamedPipe(stdoutReadHandle, nullptr, 0, nullptr, &dataSize, nullptr);
  if (!success)
  {
    return false;
  }

  if (dataSize > 0)
  {
    // Read the data out of the pipe.
    CHAR buffer[4096] = { 0 };
    success = ReadFile(stdoutReadHandle, buffer, sizeof(buffer) - 1, &dataSize, nullptr);
    if (!success)
    {
      return false;
    }
    out << buffer;
  }
} while (exitCode == STILL_ACTIVE || dataSize != 0);

It is a bit dodgy IMO. Besides the race that @Gąska points out, you essentially end up in a tight polling loop when the pipe is empty and the process is still alive. Even if you fix the race, I would recommend avoiding this solution due to the polling loop.

I don't really see the busy wait as being an issue. The child process normally only runs for a few milliseconds, it just spams a lot of data onto stdout at once that I have to deal with. I guess I could add a yield as well.

cvi

I don't really see the busy wait as being an issue. The child process normally only runs for a few milliseconds, it just spams a lot of data onto stdout at once that I have to deal with. I guess I could add a yield as well.

It's your program, so it's your call.

I consider busy waits/polling loops to be a really icky code smell (for code that is running on a multitasking system), and would reject such in a code review with extreme prejudice. Even with a yield thrown into the mix. IMO, it's just almost always bad code. (Even here ... for what? To avoid getting an ERROR_BROKEN_PIPE?)

Gąska

@deadfast you shouldn't use returned data size as the amount you query - you should always use buffer size. You can always read the rest later. Also, the unnecessary zeroing of buffer irks me a bit. But otherwise looks good. But I agree with @cvi that busyloop isn't the best solution.

After some more googling, I finally found the piece of documentation that says yes, you should indeed close the pipe handle in parent process after the child inherits it. So, the best way to go is stick with @anotherusername's solution and wait until you get broken pipe.

cvi

you shouldn't use returned data size as the amount you query - you should always use buffer size.

AFAIK he does that (sizeof(buffer)-1). The &dataSize parameter passed to ReadFile is lpNumberOfBytesRead, i.e. ReadFile will store the number of bytes actually read in it.

You can always read the rest later.

Yep. The current version may lose data if the pipe contains more than sizeof(buffer)-1 elements when the child process exited.

Gąska

@gąska said in Reading stdout with WinAPI:

you shouldn't use returned data size as the amount you query - you should always use buffer size.

AFAIK he does that (sizeof(buffer)-1). The &dataSize parameter passed to ReadFile is lpNumberOfBytesRead, i.e. ReadFile will store the number of bytes actually read in it.

Oh, right. That's what happens when you have bazillion arguments in a function.

You can always read the rest later.

Yep. The current version may lose data if the pipe contains more than sizeof(buffer)-1 elements when the child process exited.

No, it will loop around one more time and read remaining input.

Deadfast

After some more googling, I finally found the piece of documentation that says yes, you should indeed close the pipe handle in parent process after the child inherits it. So, the best way to go is stick with @anotherusername's solution and wait until you get broken pipe.

Unless I'm missing something that page doesn't talk about broken pipes. Quite the opposite, it says that you should be getting a zero bytes read:

The parent process uses the ReadFile function to receive input from the pipe. [..] When all write handles to the pipe are closed, the ReadFile function returns zero.

This is exactly what I was expecting too but it doesn't seem to be happening.

Deadfast

@deadfast Actually, reading the ReadFile() docs again, maybe ERROR_BROKEN_PIPE really is legit. Fucking hell, this is why I normally try to steer clear of WinAPI...

Gąska

@deadfast "returns" here means the function return value (the BOOL), not the data read. 0 means FALSE. If you go to the ReadFile documentation, you'll see this:

If an anonymous pipe is being used and the write handle has been closed, when ReadFile attempts to read using the pipe's corresponding read handle, the function returns FALSE and GetLastError returns ERROR_BROKEN_PIPE.

In other words: create pipe, create process with inherited write handle as its stdout, close your own write handle. Read for as long as ReadFile() returns TRUE. If it returns FALSE, you check GetLastError(). If the error is ERROR_BROKEN_PIPE, it means you've read all the data successfully and the other process exited. If the error is something else, it means something went wrong.
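A minimal sketch of that read loop, with the WinAPI call abstracted away so the control flow can be tried without a real pipe. Everything named here is an assumption for illustration: `ReadResult`, `ReadChunk`, `readUntilBrokenPipe`, and `demoReadUntilBrokenPipe` are hypothetical, and in real code the `ReadChunk` callable would wrap ReadFile() on the read handle plus GetLastError(). The constant mirrors the value of ERROR_BROKEN_PIPE in winerror.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

constexpr std::uint32_t kErrorBrokenPipe = 109;  // ERROR_BROKEN_PIPE

// One read attempt: mirrors ReadFile's BOOL result, lpNumberOfBytesRead,
// and the GetLastError() value observed after a failure.
struct ReadResult {
    bool ok;
    std::size_t bytesRead;
    std::uint32_t lastError;
};

// Hypothetical reader; real code would call ReadFile here.
using ReadChunk = std::function<ReadResult(char* buffer, std::size_t capacity)>;

// Reads until the reader fails. Returns true if the failure was a broken
// pipe (the writer closed its end, so all output has been read) and false
// for any other error.
bool readUntilBrokenPipe(const ReadChunk& readChunk, std::string& out) {
    char buffer[4096];  // no zero-fill: only bytesRead bytes are ever used
    for (;;) {
        const ReadResult r = readChunk(buffer, sizeof(buffer));
        if (!r.ok) {
            return r.lastError == kErrorBrokenPipe;
        }
        out.append(buffer, r.bytesRead);  // append by count, not as a C string
    }
}

// Feeds two chunks through the loop, then reports a broken pipe.
std::string demoReadUntilBrokenPipe() {
    const std::vector<std::string> chunks = {"hello ", "world"};
    std::size_t next = 0;
    ReadChunk fake = [&](char* buffer, std::size_t capacity) -> ReadResult {
        if (next == chunks.size()) {
            return {false, 0, kErrorBrokenPipe};  // writer closed its end
        }
        const std::string& c = chunks[next++];
        assert(c.size() <= capacity);
        c.copy(buffer, c.size());
        return {true, c.size(), 0};
    };
    std::string out;
    const bool clean = readUntilBrokenPipe(fake, out);
    return clean ? out : std::string("<error>");
}
```

Appending exactly `bytesRead` bytes sidesteps the need for a zeroed, null-terminated buffer.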

Yes, the WinAPI docs are a mess. But it's still light years ahead of most other software.

Deadfast

Yes, the WinAPI docs are a mess. But it's still light years ahead of most other software.

Not compared to .NET which is what I normally try to use when doing stuff like this. Unfortunately this is a legacy application with code so terrible that even a busy wait is an improvement. Honestly, I consider the utter lack of any Unicode support a bigger issue.

I've ended up switching to checking the return value and:

If error is ERROR_BROKEN_PIPE, we're done.
If error is ERROR_MORE_DATA, keep reading.
If error is anything else, error out.
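Those three cases could be captured in a small helper along these lines. This is a sketch, not the code from the actual application: `NextStep` and `afterFailedRead` are made-up names, and the constants are spelled out with the values ERROR_BROKEN_PIPE and ERROR_MORE_DATA have in winerror.h so the snippet builds without windows.h. The argument is meant to be the GetLastError() value taken right after ReadFile() returned FALSE.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kErrorBrokenPipe = 109;  // ERROR_BROKEN_PIPE
constexpr std::uint32_t kErrorMoreData = 234;    // ERROR_MORE_DATA

enum class NextStep {
    Done,         // writer closed the pipe; all output has been read
    KeepReading,  // more data is pending; call ReadFile again
    Fail          // anything else is a genuine error
};

// Decides what the read loop should do after ReadFile() returned FALSE.
NextStep afterFailedRead(std::uint32_t lastError) {
    switch (lastError) {
        case kErrorBrokenPipe: return NextStep::Done;
        case kErrorMoreData:   return NextStep::KeepReading;
        default:               return NextStep::Fail;
    }
}
```

Keeping the decision in one place makes it easy to drop or add a case later without touching the loop itself.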

I also removed that zero-out. It was just a quick and lazy way of ensuring the string was null terminated for appending to the stream which I don't use in the actual code.

Thanks everyone!

cvi

No, it will loop around one more time and read remaining input.

Sorry, you're right. I missed the || dataSize != 0 in the while condition. :-/

Deadfast

@gąska said in Reading stdout with WinAPI:

No, it will loop around one more time and read remaining input.

Sorry, you're right. I missed the || dataSize != 0 in the while condition. :-/

Full disclosure: I edited that in a bit later when I realized that myself. You may have seen the original version.

ObjectMike

If you're unsure of the proper way to do it, you could look at the .net process launching code.

Reference Source

There seems to be an interesting comment on line 1982 about std io handles.

Gąska

@gąska said in Reading stdout with WinAPI:

Yes, the WinAPI docs are a mess. But it's still light years ahead of most other software.

Not compared to .NET which is what I normally try to use when doing stuff like this.

Some more advanced parts of .NET have no documentation whatsoever. WPF API reference has missing documentation in many places, and for a large number of items that are documented, the documentation explains nothing. The overall quality of .NET docs is very similar to WinAPI docs - which is no surprise considering they both come from Microsoft.

If error is ERROR_BROKEN_PIPE, we're done.

Yes. Just don't forget to WAIT for exit status. There's a possibility the pipe closes before the process actually exits.

If error is ERROR_MORE_DATA, keep reading.

The docs say it can't happen with anonymous pipes. It might turn out to be dead code.

If error is anything else, error out.

That's right.

Gąska

@mikehurley said in Reading stdout with WinAPI:

There seems to be an interesting comment on line 1982 about std io handles.

Very interesting indeed. The article it refers to, as well as the comment itself, both say that the child can close the handle and bad things will happen for the parent - suggesting that inheriting handles doesn't clone them after all; contradicting the article about pipes I linked upthread. Gosh, this is all so confusing.

I think I'll fire off an email to Raymond Chen to answer it once and for all.

cvi

The article it refers to, as well as the comment itself, both say that the child can close the handle and bad things will happen for the parent

Yeah, I don't understand that part 100% either. So what if the child also closes the parent-end handles? (Then again, not passing them to the child is probably a good idea anyway.)

But the key point is the following comment in the sample code from the article:

     // Close pipe handles (do not continue to modify the parent).
     // You need to make sure that no handles to the write end of the
     // output pipe are maintained in this process or else the pipe will
     // not close when the child process exits and the ReadFile will hang.
     if (!CloseHandle(hOutputWrite)) DisplayError("CloseHandle");
     if (!CloseHandle(hInputRead )) DisplayError("CloseHandle");
     if (!CloseHandle(hErrorWrite)) DisplayError("CloseHandle");

That was exactly what @Deadfast was experiencing.

blakeyrat

@deadfast Actually, reading the ReadFile() docs again, maybe ERROR_BROKEN_PIPE really is legit. Fucking hell, this is why I normally try to steer clear of WinAPI...

(Just FYI, this is really easy in .NET.)

(EDIT: oh you knew that but you have to work in a crappy legacy language because it's a crappy legacy program. Condolences.)

Gąska

@blakeyrat said in Reading stdout with WinAPI:

@deadfast said in Reading stdout with WinAPI:

@deadfast Actually, reading the ReadFile() docs again, maybe ERROR_BROKEN_PIPE really is legit. Fucking hell, this is why I normally try to steer clear of WinAPI...

(Just FYI, this is really easy in .NET.)

High-level libraries provide high-level solutions to high-level problems. News at 11.

Deadfast

@blakeyrat said in Reading stdout with WinAPI:

it's a crappy legacy program

You have no idea. I've barely scratched the surface. My favourite so far is a homegrown lousy knock-off of std::ostringstream that doesn't append the null terminator when you append a string to it...

Thankfully I don't have to look at it much further, I'm just helping out a coworker because, for some reason, it's most broken on my PC.

ObjectMike