Let's talk about Ceph


  • Grade A Premium Asshole

    Is anyone here currently using it? I would like to discuss some details of use with someone who is actually running it in production, and I would prefer to hear the opinion of a cynical, skeptical bastard if at all possible, so this forum seems like the perfect place to ask such questions.

    One of our services has a one-to-one relationship with the client-side process, which works very well overall but is not very efficient in its use of capacity. Each server has quite a bit of wasted space, and expanding volumes when a server reaches capacity is a bit of a process. Ceph seems like it could help solve those issues and could be a pretty easy replacement for our current storage implementation.

    For those who may have used it, how tolerant is it of differences between nodes? If I have a set of nodes that each have X disks of capacity Y, and down the road I add new nodes with disks of capacity 2Y due to the progression of technology and falling disk prices, does that impact the performance of the cluster?
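
    From the docs I have read so far, the answer seems to be that CRUSH weights each disk by its capacity, so a 2Y disk simply gets roughly twice the data (and therefore roughly twice the IO, which is the part I would actually worry about). My toy mental model of the placement, which is emphatically not Ceph's real straw2 algorithm, is roughly:

    ```python
    import random

    # Toy model of capacity-weighted placement (NOT Ceph's actual straw2
    # algorithm): each OSD's chance of receiving a placement group is
    # proportional to its CRUSH weight, which by convention tracks
    # capacity in TiB. The disk sizes are made-up examples.
    osds = {
        "osd.0": 4.0,  # old node, Y = 4 TiB disk -> weight 4.0
        "osd.1": 4.0,
        "osd.2": 8.0,  # new node, 2Y = 8 TiB disk -> weight 8.0
    }

    def place_pg(pg_id: int) -> str:
        """Pick an OSD for a placement group, weighted by capacity."""
        rng = random.Random(pg_id)  # deterministic per PG, like a hash
        names = list(osds)
        weights = [osds[n] for n in names]
        return rng.choices(names, weights=weights, k=1)[0]

    counts = {name: 0 for name in osds}
    for pg in range(10_000):
        counts[place_pg(pg)] += 1

    # osd.2 ends up with roughly twice the PGs of osd.0/osd.1: double
    # the data, but also roughly double the IO on a single spindle.
    print(counts)
    ```

    So capacity-wise, mixed sizes look fine; what I would want a production user to confirm is whether the doubled IO per bigger disk hurts in practice.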

    I am having difficulty finding any references in regard to replication, or is that seen as not needed? Currently we also have a 1-to-1 relationship between master and slave storage nodes. All data is replicated to a backup storage node, so in the unlikely event something happens to the master, we have a process to promote the slave node and get back to where we were before whatever happened. The slave node also has nothing but the basics running on it, not even our own software, so if we royally fucked something up we can bin the master node and recover to the slave. In the several years we have been operating we have never had to use it, but I am a belt and suspenders sort of guy and I don't know how I would feel not having this.

    Going to a consolidated storage cluster would bring significant savings from not needing 2N hardware, but then we find ourselves in an "all your (customer's) eggs in one basket" situation, and that does not seem prudent. I cannot find information about real-world implementations that covers this concern. Can it be trusted enough to not worry about the extra layer of redundancy?
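
    On the "can it be trusted" question, the back-of-envelope I have been doing for myself looks like the sketch below. It assumes independent disk failures and a fixed recovery window, both of which are generous assumptions, and every number in it is something I pulled out of the air rather than measured:

    ```python
    # Rough back-of-envelope for losing all replicas of one placement
    # group before Ceph finishes re-replicating. All numbers are
    # assumptions for illustration, not measurements.
    AFR = 0.04          # assumed annual failure rate per disk (4%)
    RECOVERY_HOURS = 8  # assumed window to re-replicate a failed OSD
    REPLICAS = 3        # typical pool setting: size=3, min_size=2

    HOURS_PER_YEAR = 24 * 365
    p_fail_in_window = AFR * RECOVERY_HOURS / HOURS_PER_YEAR

    # Probability that the other (REPLICAS - 1) holders of the same PG
    # also die inside the recovery window of the first failure.
    p_lose_pg = p_fail_in_window ** (REPLICAS - 1)
    print(f"{p_lose_pg:.2e}")  # ~1.3e-09 with these made-up numbers
    ```

    Which looks comforting on paper, but it only covers disks dying; it says nothing about us royally fucking up the cluster itself, which is what the slave node really protects against today.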

    I am sure there are other things, but no sense worrying about it unless someone here on the forums has the experience required to answer them.



  • @Polygeekery
    Not an answer to your questions, but since I may also have to look into this in the future: Is there a specific reason why you're looking at Ceph and not GlusterFS?


  • đźš˝ Regular

    @Polygeekery I just want to acknowledge I see what you did there.


  • Grade A Premium Asshole

    @dfdub Ceph seems like an easier integration for us, given the way we currently do things. Admittedly I am in the cursory stages of research, but Ceph natively supports protocols such as iSCSI and its own file system (CephFS). For us, it seems like it could be a drop-in replacement with (what should be) an easy migration path. I will look further into GlusterFS, but I think supporting it would require a pretty extensive rewrite on our side.
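
    For a sense of what the native integration looks like, the object API is pretty small. Here is a minimal sketch using the python-rados bindings, which I have not yet run against a real cluster; the conf path is the stock default and the pool name is a placeholder:

    ```python
    import rados

    # Connect using the standard ceph.conf; the pool name below is a
    # placeholder, not something that exists out of the box.
    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx("testpool")  # hypothetical pool
        try:
            # Objects are a flat key -> bytes namespace; filesystem
            # semantics only appear once CephFS is layered on top.
            ioctx.write_full("greeting", b"hello from the forum")
            print(ioctx.read("greeting"))
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
    ```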


  • Grade A Premium Asshole

    @Zecc I was wondering if anyone would get that. I nearly started the thread with a set of adapted lyrics.



  • @Polygeekery Cool, let me know what you find out.


  • Grade A Premium Asshole

    @dfdub said in Let's talk about Ceph:

    @Polygeekery Cool, let me know what you find out.

    Apparently fuck all, because it doesn't seem that anyone outside of CERN uses it.

    I think I have 5-6 Gen8 HP Microservers in my storage area. Once my schedule clears it may be time to try things out.

    For another opinion, why were you looking at GlusterFS over Ceph?



  • @Polygeekery said in Let's talk about Ceph:

    For another opinion, why were you looking at GlusterFS over Ceph?

    If you're looking for an insightful answer, I'm afraid I don't have one. So far my train of thought was:

    • Proxmox supports both GlusterFS and Ceph as storage backends
    • Both provide distributed POSIX file systems
    • Ceph seems to be primarily a distributed object store with a file system built on top, while GlusterFS is primarily a distributed file system.

    Due to a lack of experiments and further research, I have no idea whether the latter is actually important in practice.
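
    To make the distinction itself concrete anyway: my rough understanding (from skimming docs, so treat it as an assumption) is that CephFS chops each file into fixed-size RADOS objects named after the inode and stripe index, roughly like the sketch below, while GlusterFS keeps whole files as ordinary files on each brick's local filesystem.

    ```python
    # Rough model of how a filesystem layer maps file bytes onto a flat
    # object store. The 4 MiB object size and "<inode hex>.<index hex>"
    # naming mimic what I understand of CephFS's defaults, but this is
    # an illustration, not the real implementation.
    OBJECT_SIZE = 4 * 1024 * 1024  # 4 MiB stripe

    def file_offset_to_object(inode: int, offset: int) -> tuple[str, int]:
        """Return (object name, offset within that object) for a file byte."""
        index = offset // OBJECT_SIZE
        return f"{inode:x}.{index:08x}", offset % OBJECT_SIZE

    # Byte 9 MiB into inode 0x10000000000 lands in the third object:
    print(file_offset_to_object(0x10000000000, 9 * 1024 * 1024))
    # -> ('10000000000.00000002', 1048576)
    ```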


  • Considered Harmful

    @Polygeekery said in Let's talk about Ceph:

    I am a belt and suspenders sort of guy

    [image: 2013-08-12-Get-My-Belt.png]


  • BINNED

    @error

    "Now get my FANNY PACK."



  • BINNED

    @dfdub No knowledge or experience with these kinds of things, but since you mentioned GlusterFS, my colleagues might appreciate it if I plug BeeGFS.


  • Grade A Premium Asshole

    @topspin Well, isn't this space just getting a little cramped?

    If we went back to the OP, could you answer those questions in regard to your BeeGFS solution?

    Please tell me there is a "Stayin' Alive" tagline for the product? If not, it is a missed marketing opportunity. That shit gets stuck in your head.


  • BINNED

    @Polygeekery I’m afraid I can’t be of too much help here, sorry. I only mentioned it because it’s (afaict) on topic and you might want to check it out, not because I’m suggesting it actually is a good solution.

    Yeah, definitely sounds like it, but the logo suggests the pun is on bee, not Bee Gees.



  • @topspin said in Let's talk about Ceph:

    Yeah, definitely sounds like it, but the logo suggests the pun is on bee, not Bee Gees.

    To be fair, if you start punning on the Bee Gees and using that in marketing, you might get a call from some copyright lawyers.

