#1 Webinar: Getting Started with Ceph
Learn in this webinar:
• The architectural requirements of the Ceph Cluster
• The role of the core RADOS components
• What happens if an OSD fails
• How to spin up a cluster using a VM image
• What is required to expand the cluster
View recorded webinar
#2 Webinar: Intro to Ceph with OpenStack
Learn in this webinar:
• What you need to consider for selecting the best cloud storage system
• Overview of the Ceph architecture and unique features and benefits
• Best practices in deploying cloud storage with Ceph and OpenStack
View recorded webinar
Audience’s Questions
For our needs we might also need iSCSI access to our storage. What are the experiences with using Linux as an iSCSI target backed by Ceph?
Several users are doing this today and finding it a good solution that meets their needs. Inktank is working to contribute code that simplifies the process further by integrating Ceph more tightly with iSCSI target software. The patches are currently being tested for upstream integration.
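A common pattern is to create an RBD image, map it through the kernel RBD driver, and export the resulting block device through a Linux iSCSI target such as LIO or tgt. As a rough illustration, the sketch below uses the official Python rados/rbd bindings to create the backing image; the pool name, image name, and size are made up, and the mapping and iSCSI export steps still happen outside this code.

    # Hedged sketch: create an RBD image to back an iSCSI LUN.
    # Assumes a reachable cluster, /etc/ceph/ceph.conf, and an existing pool
    # named "iscsi-pool" (pool and image names are hypothetical).
    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('iscsi-pool')
    try:
        rbd.RBD().create(ioctx, 'lun0', 100 * 1024**3)  # 100GB backing image
    finally:
        ioctx.close()
        cluster.shutdown()
    # After this, the image would be mapped with the kernel RBD driver and the
    # resulting /dev/rbd* device exported by the iSCSI target software.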
How small can OSDs be? Would it make more sense to have a larger number of small OSDs (for example, 24 OSDs of 100GB each per cluster node)? What are the best practices for configuring OSDs? Can you cover the differences between xfs, ext4, and btrfs for OSDs? Can an FC SAN LUN be used as an OSD?
There are lots of questions here:
(1) OSDs can be arbitrarily small, but they should be large enough to justify the CPU and memory resources devoted to them. It generally makes sense to have one OSD per drive, and with SSDs, many 100GB OSDs may make sense for certain applications.
(2) As a general-purpose guideline, we recommend about 1GHz of CPU cycles and 2GB of memory per OSD on spinning disks. This covers recovery scenarios; standard operations should use much less. The series of performance-related blog posts on Ceph.com can help with sizing, and Inktank offers professional services to help match configurations to application needs.
(3) The performance blog posts explore some of the differences between the filesystem choices for the OSD. In general, xfs is a safe middle choice; btrfs delivers great peak performance but may be less reliable and can show strange corner-case performance; and ext4 is the most ubiquitous and stable, at the cost of lower performance.
(4) It's certainly possible to build OSDs on FC LUNs: as long as the LUNs are visible to the Linux disk subsystem they can be used. It's unlikely, but you may need to tweak the device start order in the OS so the LUNs are available before Ceph needs them. If the hardware is already there, this can be a good way to experiment and prototype a solution. In the long run, though, FC storage is discouraged because it adds considerable unnecessary cost to the solution.
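To make the sizing guideline in (2) concrete, here is a rough back-of-the-envelope calculation. It assumes the 1GHz/2GB-per-OSD rule of thumb above and a hypothetical 24-OSD node; the numbers are illustrative, not a validated configuration.

    # Rough per-node sizing from the rule of thumb above (illustrative only).
    osds_per_node = 24          # e.g. one OSD per drive in a 24-bay chassis
    cpu_ghz_per_osd = 1.0       # ~1GHz of CPU cycles per OSD (guideline above)
    ram_gb_per_osd = 2.0        # ~2GB of memory per OSD (guideline above)

    total_cpu_ghz = osds_per_node * cpu_ghz_per_osd   # 24 GHz aggregate
    total_ram_gb = osds_per_node * ram_gb_per_osd     # 48 GB
    print("CPU needed: ~%.0f GHz aggregate, RAM needed: ~%.0f GB"
          % (total_cpu_ghz, total_ram_gb))
    # A dual 8-core 2GHz system (~32GHz aggregate) with 64GB of RAM would
    # comfortably cover this, with headroom for recovery scenarios.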
So should I have one SSD per storage node for journaling?
Not necessarily; it depends on a number of factors. In some cases one SSD may be sufficient, while in others it can become a bottleneck and rapidly wear out. Different applications will have a different ideal ratio of SSD journals to spinning disks, taking into account the rate of write IO and the bandwidth requirements of the node.
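As a rough way to think about that ratio, the journal SSD has to absorb the combined write bandwidth of the spinning disks behind it. The short sketch below estimates the break-even point using made-up device figures (an SSD sustaining ~400MB/s of writes and disks sustaining ~100MB/s each); real devices and workloads will differ.

    # Illustrative estimate of how many spinning-disk OSDs one journal SSD
    # can serve before the SSD becomes the write bottleneck.
    ssd_write_mb_s = 400.0      # assumed sustained sequential write of the SSD
    hdd_write_mb_s = 100.0      # assumed sustained write per spinning disk

    max_osds_per_ssd = int(ssd_write_mb_s // hdd_write_mb_s)
    print("Roughly %d spinning-disk OSDs per journal SSD" % max_osds_per_ssd)
    # Beyond this ratio the SSD caps write throughput; heavy write workloads
    # also accelerate SSD wear, which is the other factor mentioned above.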
Is there a limit to the number of physical disks possible in a single storage node? Or is it only the 500MB-per-disk recommendation? Is it practical to have something like only 6 storage nodes, where each one has plenty of RAM - say 64GB - so that each storage node can hold up to 128 disks?
The recommended sizing guideline for OSDs is 1GHz of CPU and 2GB of memory per OSD, in order to handle recovery scenarios smoothly. For large clusters and general purpose storage, something like a 36-drive node with dual 8-core CPUs and 96GB of memory might work well. Specialized deployments may be able to use denser platforms. Keep in mind that node size also impacts failure domains, recovery times, and performance in degraded states. If a node with many OSDs fails in a small cluster, the remaining nodes may be too busy handling recovery tasks to meet application SLAs.
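To illustrate the failure-domain point, here is a rough estimate of how much data must be re-replicated when a dense node like the 36-drive example fails, and how long that might take. The drive size, fill level, and recovery bandwidth in this sketch are assumptions made purely for the arithmetic.

    # Illustrative recovery estimate for a failed 36-drive node (assumed numbers).
    drives_per_node = 36
    drive_size_tb = 4.0
    fill_ratio = 0.6                    # assume OSDs are ~60% full
    recovery_bw_gb_s = 2.0              # assume the rest of the cluster can
                                        # re-replicate at ~2GB/s aggregate

    data_to_recover_tb = drives_per_node * drive_size_tb * fill_ratio   # ~86 TB
    recovery_hours = data_to_recover_tb * 1024 / recovery_bw_gb_s / 3600
    print("~%.0f TB to re-replicate, roughly %.1f hours at the assumed rate"
          % (data_to_recover_tb, recovery_hours))
    # In a small cluster that recovery traffic competes directly with client IO,
    # which is why very dense nodes and very small clusters are a risky mix.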
#3 Webinar: DreamHost Case Study: DreamObjects with Ceph
This webinar discusses best practices and lessons learned in creating DreamObjects, including the need to manage scale, speed, monitoring, uptime, security and cost.
View recorded webinar
#4 Webinar: Advanced Features of the Ceph Distributed Storage System Delivered by Sage Weil, Ceph Creator
Learn in this webinar:
• Deploying Ceph
• Enhance Deployment
• Block Devices
Is there a way to limit how much people can upload to the rados gateway? A user limit?
There are no software limitations on the number of users that Ceph's radosgw can support; it comes down to your hardware and system configuration. That said, Ceph supports a scale-out strategy for radosgw nodes: if you need to support more users and/or traffic, you can simply add more.
What’s the smallest number of nodes for reasonably performant storage for hosting VMs?
The minimum number of nodes recommended for a production cluster is 3. This ensures that the cluster can maintain quorum across node failures and preserve data redundancy. It's difficult to comment on what's required for reasonable performance, because that depends on the VM demands, the power behind each cluster node, and what kinds of disks are used. Inktank provides professional services that can help tailor a solution to specific requirements, and can optimize for goals such as cost or electrical limitations when recommending a solution.
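For what it's worth, quorum can also be checked programmatically through the monitors. The sketch below uses the Python rados bindings and the 'quorum_status' monitor command; it assumes a reachable cluster with an admin keyring configured in /etc/ceph/ceph.conf.

    # Minimal quorum check via the Python rados bindings (assumes a running
    # cluster and a standard /etc/ceph/ceph.conf with client credentials).
    import json
    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ret, outbuf, errs = cluster.mon_command(
            json.dumps({'prefix': 'quorum_status', 'format': 'json'}), b'')
        status = json.loads(outbuf.decode('utf-8'))
        print("Monitors in quorum:", status.get('quorum_names'))
    finally:
        cluster.shutdown()
    # With three monitor nodes, losing one still leaves two of three in quorum.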
How much storage per storage node? You mentioned 8-12 OSDs per server - are those 2-3 TB disks?
Typically, yes. We often see deployments where the node hardware supports 12 spinning disks in the 2-4TB range. Sometimes these are enhanced with SSDs for journaling or a separate high performance pool of storage.
You guys run storage nodes with no SSD journals - doesn't that hurt performance?
For many applications, it's possible to get good performance without using SSDs for journals. The journal device is written sequentially and played out to the rest of the OSD at a later time. For data that isn't being aggressively updated, journals on the OSD itself can be sufficient. One simple optimization is to partition the disk so that a few GB of the outer cylinders are used for the journal and the remaining surface for the OSD data.
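For sizing that journal partition, the rule of thumb in the Ceph documentation is roughly twice the expected throughput multiplied by the filestore max sync interval. The sketch below just works through that arithmetic with assumed numbers; treat it as an illustration rather than a tuned value.

    # Journal size rule of thumb: ~2 * expected throughput * filestore sync interval.
    # The throughput and interval below are assumptions for illustration.
    expected_throughput_mb_s = 100.0      # what the disk (or its share of the
                                          # network) can actually sustain
    filestore_max_sync_interval_s = 5.0

    journal_size_mb = 2 * expected_throughput_mb_s * filestore_max_sync_interval_s
    print("Suggested journal size: ~%.0f MB" % journal_size_mb)   # ~1000 MB
    # A few GB of the outer cylinders, as suggested above, leaves plenty of margin.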
What is the normal bandwidth that the Ceph cluster is providing to the servers?
That depends on the size of the cluster and configuration of the individual nodes. One example might be to consider a node with 24 OSDs and a 10GbE link to the application servers. It should be possible to saturate the 10GbE link for that node by reading data from the OSDs and feeding it to the servers. Scalability is expected to be linear, so adding more storage nodes and application servers should increase the aggregate bandwidth in the cluster. Check out the performance blog posts on Ceph.com for more insight on sizing individual nodes.
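As a quick sanity check on that example, the arithmetic below compares the aggregate read bandwidth of a 24-OSD node against a 10GbE link, assuming roughly 100MB/s of sustained reads per spinning disk.

    # Back-of-the-envelope bandwidth check for the 24-OSD example above
    # (per-disk throughput is an assumption for illustration).
    osds_per_node = 24
    per_osd_read_mb_s = 100.0           # assumed sustained read per spinning disk
    nic_gbit_s = 10.0

    aggregate_disk_mb_s = osds_per_node * per_osd_read_mb_s      # ~2400 MB/s
    nic_mb_s = nic_gbit_s * 1000 / 8                             # ~1250 MB/s
    print("Disks: ~%.0f MB/s, NIC: ~%.0f MB/s -> the 10GbE link saturates first"
          % (aggregate_disk_mb_s, nic_mb_s))
    # Adding nodes (and links) multiplies both sides, which is why aggregate
    # bandwidth is expected to scale roughly linearly with the cluster.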
How is the performance with Xen compared to KVM or others?
It should be similar, although we haven't performed any benchmarks to compare the two at this time. KVM and libvirt can talk to Ceph's block devices directly, while Xen currently requires first mapping the Ceph block device via the Linux kernel driver. Most of the IO stays in the kernel either way, but the KVM approach bypasses a few layers which may improve latency to some degree.
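To make the distinction concrete, the "direct" path used by KVM/QEMU goes through librbd in userspace, which is roughly the path the sketch below exercises (the pool and image names are hypothetical and assumed to exist), while the Xen path first maps the image to a /dev/rbd* device through the kernel driver and hands that to the hypervisor.

    # Userspace librbd access via the Python bindings; no kernel block device
    # is involved. Pool and image names are made up for illustration.
    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('vms')
    image = rbd.Image(ioctx, 'guest-disk0')
    try:
        image.write(b'hello from userspace', 0)   # write 20 bytes at offset 0
        print(image.read(0, 20))                  # read them back
    finally:
        image.close()
        ioctx.close()
        cluster.shutdown()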
Have you compared XFS OSD to ext4? Any benchmark data for using XFS over ext4?
Take a look at the performance articles written by Mark Nelson on the Ceph.com blogs. There is a blog article comparing different OSD file system options.