Software Defined Storage

Ceph Dashboard Configuration for Storage Health Metrics

Learn how to set up a Ceph Dashboard for storage health and metrics and have visibility into the health of your Ceph storage environment.

If you are using Ceph storage either in Proxmox or using Ceph storage in vanilla Linux outside of Proxmox, you likely want to have a way to see the health of your Ceph storage environment. Setting up a Ceph dashboard to see the health of your Ceph storage environment is a great way to have visibility on the health of your Ceph environment. Let’s look at how to setup a Ceph Dashboard for object storage health to see your storage health.

What is Ceph?

Ceph is software-defined object storage that makes use of local storage attached to hosts and logically shares this between the hosts and presents it as shared storage. This shared storage is replicated between the hosts or using erasure coding if you meet those requirements.

Setting up a Ceph dashboard

So, let’s look at the steps to setup a Ceph dashboard. In my configuration, I have three Ubuntu servers configured as a Swarm cluster and as Ceph nodes for the underlying storage with CephFS running on top. The steps to configure this as an overview, including the following steps:

  1. Enable the Prometheus module in Ceph
  2. Create a Prometheus configuration to scrape your Ceph servers
  3. Use Grafana to visualize the data scraped using Prometheus

1. Enable the Prometheus module in Ceph

Using the command line, there is an easy way to enable the Prometheus module in Ceph. How is this done? You can do that with the command:

ceph mgr module enable prometheus

As a note, you only have to do this on one of your hosts. It will then enable the module across the board.

Enabling the prometheus module in ceph
Enabling the prometheus module in ceph

After enabling, it is a good idea to make sure your Ceph hosts are listening on the exposed port of 9283. Below, you can see we can use the command:

netstat -tuln | grep 9283
Checking to make sure the prometheus port is open on the ceph hosts
Checking to make sure the prometheus port is open on the ceph hosts

After verifying the hosts are listing on the exposed port, we are good to move on to the next step.

2. Create a Prometheus configuration to scrape your Ceph servers

Next, we need to create a Prometheus configuration for our Prometheus server (mine is external to my Ceph servers) to scrape the configuration of Ceph from the exposed port we have enabled.

In case you want to see a quick overview of how you would spin up a Prometheus server using Docker Compose, note the following:

services:
  prometheus:
    image: portainer/template-swarm-monitoring:prometheus-v2.44.0 
    user: "1001" 
    environment: 
      - PUID=1001 
      - PGID=1001
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--log.level=error'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=7d'
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
    volumes:
      - type: bind
        source: /mnt/cephfs/prometheus
        target: /prometheus
      - type: bind
        source: /mnt/cephfs/prometheus/prometheus.yml  # Correct path to the config file on the host
        target: /etc/prometheus/prometheus.yml         # Where Prometheus expects to find the config
    networks:
      - npm-stack_nginxproxy

Note the following Prometheus configuration.

You can add this to your existing configuration in Prometheus as an additional job:

# Job for Ceph Manager metrics
  - job_name: 'ceph_mgr'
    static_configs:
      - targets:
          - '10.1.149.21:9283'  # Ceph Mgr node 1
          - '10.1.149.22:9283'  # Ceph Mgr node 2
          - '10.1.149.23:9283'  # Ceph Mgr node 3

If you don’t have any existing configuration, you can use the following code:

# Ceph configuration
global:
    external_labels:
        monitor: 'Ceph'

# Scrape Job
scrape_configs:
  - job_name: 'microceph'

    # Ceph's default for scrape_interval is 15s.
    scrape_interval: 15s

    # List of all the ceph-mgr instances along with default (or configured) port.
    static_configs:
    - targets: ['10.1.149.21:9283', '10.1.149.22:9283', '10.1.149.23:9283']

Once you restart your Prometheus server config so that it reads your latest configuration file, you should be able to see that it has discovered your Ceph endpoints.

Viewing the ceph endpoints in prometheus
Viewing the ceph endpoints in prometheus

3. Use Grafana to visualize the data scraped using Prometheus

Now that we have Prometheus up and running and have the Prometheus configuration setup, we can now use Grafana to visualize the data collected in Prometheus. When you go to your Grafana configuration, you can add a data source and point your connection to your Prometheus server.

Prometheus configuration in grafana as a data source
Prometheus configuration in grafana as a data source

Save and test the configuration. You should see the successful message that it was able to query the API.

Saving and testing the prometheus configuration
Saving and testing the prometheus configuration

Grafana community dashboards

So, you may be thinking, how do I get all the visualizations that I want to visualize the Prometheus data? Well, the Grafana community dashboards will save you a lot of the time and effort. You can go to the community site here: https://grafana.com/grafana/dashboards and search for Ceph. This will bring up a wealth of already published dashboards for your Ceph data.

Ceph community dashboard
Ceph community dashboard

Below is one of the ones that I settled on:

Ceph dashboard with grafana
Ceph dashboard with grafana

Wrapping up

Creating a Ceph dashboard is a great way to visualize and keep a check on your Ceph data to make sure things are healthy and performance is what you would expect. The steps are very straightforward and don’t take much to set up at all. Using the quick command line command to turn on Prometheus monitoring, you can easily enable the ability to monitor your Ceph cluster. Let me know what you are using if you are monitoring Ceph. I would be curious to know what solutions you are using.

Subscribe to VirtualizationHowto via Email ๐Ÿ””

Enter your email address to subscribe to this blog and receive notifications of new posts by email.



Brandon Lee

Brandon Lee is the Senior Writer, Engineer and owner at Virtualizationhowto.com, and a 7-time VMware vExpert, with over two decades of experience in Information Technology. Having worked for numerous Fortune 500 companies as well as in various industries, He has extensive experience in various IT segments and is a strong advocate for open source technologies. Brandon holds many industry certifications, loves the outdoors and spending time with family. Also, he goes through the effort of testing and troubleshooting issues, so you don't have to.

Related Articles

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.