Something better than memcpy?



  •  I've been wroking on a project that is half C++ (server backend) and half java (applet frontend). The two communicate by sending a buffer through an open socket. The original creator of the project (who sold it to us) used a non-dynamic memcpy on the C++ side to create the buffer. Now whenever we edit an object, stuff totally unrelated to it breaks because of the memcpy. Is there an easy fix for this? 

     

    Thanks,

    Malfist 



  • I'm thinking of something like the seralization in Java with objectstreams to pass the object with all its parts through one stream without having to explicity declare each and every variable in an object. 



  • I find your question incomprehensible. 



  •  I appologize, let me try again.

     

    Me and one other programmer works on a project called spacedomination (it's live, google it if you want to look at it). It has a java applet frontend where players login and play on the game, and it have a C++ backend that runs the server.

     They communicate with each other through streams on an open connection. The server has a port that is open that it listens for data from the applets and the applet listens for messages from the server.

    When they exchange information, it's done with a case called buffer that is a byte array. The applet can just cast everything into bytes and transfer it, but the server cannot do that.

    The server creates a buffer (in this case, a char array) and begins appending information to it using memcpy. For example, here's some code: 

    void sendWorldScan(Session *sp,int worldID,worldRec *wp){
    
    char buf[BUFFERSIZE];
    
    int len,size;
    
    
    
    if (wp==NULL) {
    
    	return;
    
    }
    
    gLog.logMsg(Log::LOG_DEBUG_VERBOSE, 4790, "Sending world scan.");
    
    
    
    len = 2;
    
    memcpy(buf+len,&worldID,sizeof(int));
    
    len += sizeof(int);
    
    memcpy(buf+len,wp,4*sizeof(int));
    
    len += 4*sizeof(int);
    
    memcpy(buf+len,&wp->maxMerchant,sizeof(int));
    
    len += sizeof(int);
    
    memcpy(buf+len,&wp->maxBeacon,sizeof(int));
    
    len += sizeof(int);
    
    memcpy(buf+len,&wp->maxStardock,sizeof(int));
    
    len += sizeof(int);
    
    memcpy(buf+len,&wp->sector,sizeof(int));
    
    len += sizeof(int);
    
    memcpy(buf+len,&wp->worldID,sizeof(int));
    
    len += sizeof(int);
    
    
    
    size = 1+strlen(wp->name);
    
    memcpy(buf+len,wp->name,size);
    
    len += size;
    
    
    
    buf[0] = len;
    
    buf[1] = mfWorldScan;
    
    sp->send(buf);
    

    }

    But when I alter a class, the memcpy doesn't match up anymore and causes some random, totally unrelated error to occur, usually a segmentation error due to an unintallized variable. Is there a better way to do this? A more dynamic way, something like java's object serialization that the applet and the server can decode into it's respective objects?



  • are you sending a serialized java object or something through the stream?  I'm assuming that you have a well defined protocol that the server and client use to exchange information.  However, that seems to not be the case (you're relying on an object being the same tends to make me think you are relying on the java serialization to be the same! 



  • I see no reason at all to use memcpy to copy single ints.

    memcpy is fast when copying a number of words, but there is probably some overhead when copying only one int. (depending on compiler, architecture etc.)
    I would instead just use something like:

    *(destination++) = wp->maxStardock;

    But why not define an object and send that?

    worldScan* buf = new worldScan(worldId, wp);
    sp->send((char*) buf);

    would be much easier to read, more consistent, and could even be made to result in exactly the same char array being sent.



  • I have a feeling your problem lies here:

    @malfist said:

    memcpy(buf+len,wp,4*sizeof(int));
     

    At least use wp->someValue instead of just copying the first 4 words of the object.



  • @olm said:

    I see no reason at all to use memcpy to copy single ints.
    memcpy is fast when copying a number of words, but there is probably some overhead when copying only one int. (depending on compiler, architecture etc.)
    I would instead just use something like:

    *(destination++) = wp->maxStardock;

     

    That is valid only if the destination pointer is currently aligned for the host platform. memcpy is how you move an aligned int into an unaligned (or possibly-unaligned) position in a buffer.

    On a good compiler (anything but msvc), there will be no major difference between the code generated for memcpy and for = when moving an int into an aligned pointer. This is accomplished by a fantastically complex blob of pure evil.
     



  • Oh. I actually read it as an int array being treated as a char array. I see now that there is an offset of two chars :-)

    I still thinks it could be done more gracefully however, e.g. keep aligned untill passed to sp->send().


    Back to the problem:
    What do you mean by "memcpy doesn't match up anymore and causes some random, totally unrelated error to occur"?
    memcpy() just uses the pointers passed to it, so if the wrong data is copied, you have a problem with your pointers.

    Other things to check:
    The four first words of the object - are you completely sure that they contain what you think they contain? And still, why not just refer to them with wp->valueName?
    Buffer overflow - you don't check for the max size of wp->name.



  • Fairly easy to do with a structure (as stated before) In VC; alignment isn't a worry if you use pragma pack (1) (in VC anyway) - just make sure 'int' is the same size.... the only twist is the variable length field 'name' which is a null-terminated string.

     #pragma pack(1)
    struct S_WorldScan
    {
      char len;
      char id;
      int worldID, maxBeacon, etc, etc...;
      char name[0];
    }

    ...

    int len = sizeof(S_WorldScan) + strlen(wp->name) + 1;
    S_WorldScan *worldScanPacket=(S_WorldScan *)malloc(len);
    worldScanPacket->len = (char) len;
    worldScanPacket->id = mfWolrdScan;
    memcpy( worldScan->name, wp->name, strlen(wp->name) + 1); // or just strcpy will do here)
    worldScanPacket->sector = wp->sector;
    worldScanPacket->blah = wp->blah;
    ....
    sp->send(worldScanPacket);
    free(worldScanPacket);

    (if you can, just alter the wp structure to match the packet format and then you can skip most of that, and send off wp as it is... )

    yes - this is a quick and dirty, and sloppy solution - as another reader pointed out - it looks like you don't have a well defined protocol; that will continue to be a pain in your behind.

     (edit: i can't remember if it's name[0] or name[] -- but you get the idea)



  • @alapalme said:

    In VC; alignment isn't a worry if (...)

     

    A popular but entirely wrong idea. It has got nothing to do with msvc. The x86 chip, rather than raising an exception on an unaligned access (like most), silently converts it into two accesses. This completely buggers up the pipeline, so the performance hit can be hundreds of cycles.



  • I wasn't suggesting that msvc 'made it work'- I meant that afaik - msvc supports 'pragma pack' .  I'm not sure which other compilers do.  

    Thanks for poiting out the performance hit.  Hopefully he can afford it!



  • @malfist said:

    But when I alter a class, the memcpy doesn't match up anymore and causes some random, totally unrelated error to occur, usually a segmentation error due to an unintallized variable. Is there a better way to do this? A more dynamic way, something like java's object serialization that the applet and the server can decode into it's respective objects?

     

    Unless as suggested you just want to send over the raw data that's contained in a data structure, there's not really much else you can do with what's built into C++ to make that kind of thing easier. As far as I know C++ doesn't have the ability to analyze itself sufficiently to automate such things. Another problem with the memcpy approach and the raw data structure approach is that your serialization code has no idea what type of data it's sending, so if you need to change things later, you could be in alot of trouble. Imagine what would happen if your boss suddenly bought $50000 of new server equipment and you suddenly discover that the CPUs are all big endian instead of small endian.

    I've done some work on game network programming myself, the network middleware I used had a serialization class which you'd use something like this:

    void MyDataPacket::Serialize( Stream &inOutData)

       inOutData.readWriteUnsignedInt(&mIntMember);

       inOutData.readWriteBool(&mBoolMember);

       inOutData.readWriteString(mStringMember, cMaxStringLength);

        // etc...

    }

    You still have the problem that you need to remember to feed all the data members into the stream in the right order, but there are alot of advantages:

    • The stream understands the type of data in each member so it can byte swap it if need be
    • The stream can check buffer lengths for you and generate warnings or exceptions if unexpected data comes in
    • The stream can do compression, like decide to send a bool in one bit instead of however many bits your compiler thinks are in a "bool" primitive
    • You can rewrite the stream to handle different packet formats if you decide to change how the client wants to recieve data
    • You can make new versions of the stream class that do clever tricks, like stream data in or out of files or accross memory to another part of the application (useful for constructing debugging logs so you can play back a sequence of recieved packets into the system or something)
    The only other approach I can think of that I've come across is how Unreal Engine handles this kind of thing. When you code in Unreal Script, which handles most of the actual gameplay logic, the Unreal Script compiler can actually construct C++ analogues of script classes allowing you to more easily integrate script and C++ together. The script compiler can also use the syntax of the script to figure out which data members need to be synchronized and how, it then somehow feeds this data to the game engine which uses it to handle data serialization.

    This is a pretty overkill method in my opinion, and may end up just making things more complicated for you in the long run. Though I guess you could set up such a system so that it would take a class defined in your made up language of choice and then generate Java and C++ source files from that class with code handling all the data I/O.


  •  If I understand you correctly, your main problem is to gracefully detect version errors. What about sending a serialization version as a "header field" (basically the first entry), and break if that doesn't match? 



  • Whenever I see the words "pointers" and "something totally unrelated breaks", I know for sure what's going on.  You can try to fix thisby rewriting as much of the relevant code as you like, but you sir have a pointer bug, and unless this is the only part of your code you use pointers in, you will very likely continue to have a pointer bug afterwards.



  • @mfah said:

    Whenever I see the words "pointers" and "something totally unrelated breaks", I know for sure what's going on.  You can try to fix thisby rewriting as much of the relevant code as you like, but you sir have a pointer bug, and unless this is the only part of your code you use pointers in, you will very likely continue to have a pointer bug afterwards.

     

    Do I understand correctly that you think the general problem is using pointers, and that the fix is to rewrite the whole code base to not use pointers?

    Pointers are not evil, and they are often necessary. (Even though some people think they are not using pointers when in fact they use lots of classes/functions which encapsulate pointers.)
    The bugs with pointers are just generally more visible than other bugs, (E.g. uninitialized pointer vs. uninitialized int).
    - and the concept of pointers can be confusing to some people.

    And no - bugs are not solved by rewriting code. Bugs are solved by finding out exactly what is wrong and then changing that. (I know - this is not a perfect world...)

     


Log in to reply