
cfxDimensions High Availability

cfxDimensions platform high availability architecture


Last updated 3 years ago

CloudFabrix's cfxDimensions platform is built on a cloud-native architecture that uses containerized microservices for all platform, infrastructure and application services. It can be deployed in distributed mode to provide high availability, and it can be scaled up to match the workload requirements of an enterprise environment.

The picture below shows the high-level architecture of the cfxDimensions platform when it is deployed in distributed mode for scale and high availability.

The cfxDimensions platform includes the following services when deployed in HA & distributed mode.

Infrastructure Services:

  • HAProxy: Load balancer service that fronts the cfxDimensions platform for UI access and all other incoming traffic.

  • Apache Tomcat: For cfxDimensions platform's UI access

  • Kafka / Zookeeper: Message queue which is used by cfxDimensions platform & application services

  • Minio Object Storage: Object storage which is used by cfxDimensions platform & application services

  • MariaDB: Database for cfxDimensions platform and application services

  • Gluster: Shared storage filesystem which is used by cfxDimensions platform & application services

  • Elasticsearch*: Used for metrics and logs data. (Note: This is an optional service that is not included by default in a cfxDimensions platform deployment. It can be deployed as an independent service where needed.)

Platform Services: The cfxDimensions platform provides the core foundation services, such as identity management, encryption/decryption of critical data, a provisioner service for installing, updating and uninstalling microservices, and a service registry. The platform services listed below are mandatory for the cfxDimensions platform to function.

  • Service Registry: All application microservices supported by the cfxDimensions platform register with the centralized Service Registry, which handles service listing, service lookups, and interactions among application microservices or with external clients.

  • Identity: The Identity service provides user identity management, including tenancy and local users. It supports integration with external identity management solutions such as LDAP/AD/SSO. Application services use this service for identity management operations.

  • Locker: It provides a secure way to store and access credentials using multi-level encryptions and advanced security principles similar to that of popular cloud providers.

  • User-preferences: This service stores user-specific preference settings for the UI (e.g. report settings, column selections, UI report layouts, etc.).

  • Notification manager: It is responsible for sending event notifications to application services about all service lifecycle events.

  • Provisioner: This service is responsible for provisioning, de-provisioning and upgrading cfxDimensions application services. It supports manual or automated placement of application services among the available (one or more) service nodes of the platform.

  • Console UI: A supporting service that provides the UI-related war files needed by the Tomcat infrastructure service. It is used only during the initial load. (Note: This service will be deprecated in the future.)

The Console UI service is deployed as a single service instance in an HA deployment because it is used only during the initial load after the Platform services are first installed. It is not used at run time once the platform services have been deployed.

Application Services:

When the cfxDimensions platform is deployed in distributed mode for scale and HA in a production environment, the table below lists the minimum number of server instances required to tolerate one server node failure and keep the AIOps applications accessible.

| cfxDimensions Service Type | Min. number of instances (VM / Baremetal) |
|---|---|
| Infrastructure Services | 3 |
| Platform Services | 2 |
| Application Services | 2 |
| cLambda (Serverless) Services | 2 |

Note: As of the current release, cfxDimensions infrastructure services are supported only in a 3-node deployment. If you need to deploy them on more than 3 nodes, please contact CloudFabrix technical support.
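The minimums above can be checked against a planned deployment with a small script. The following is only a sketch: the dictionary keys and the `check_ha_plan` helper are hypothetical names chosen for illustration and are not part of the macaw CLI or the cfxDimensions platform.

```python
# Minimum node counts per service type, copied from the table above.
# The key names are illustrative and not defined by cfxDimensions.
MIN_INSTANCES = {
    "infrastructure": 3,
    "platform": 2,
    "application": 2,
    "clambda": 2,
}


def check_ha_plan(planned: dict) -> list:
    """Return a list of problems; an empty list means the plan meets the minimums."""
    problems = []
    for service_type, minimum in MIN_INSTANCES.items():
        count = planned.get(service_type, 0)
        if count < minimum:
            problems.append(
                f"{service_type}: {count} node(s) planned, at least {minimum} required"
            )
    # Per the note above, infrastructure services are supported only on 3 nodes.
    if planned.get("infrastructure", 0) > 3:
        problems.append(
            "infrastructure: more than 3 nodes requires CloudFabrix technical support"
        )
    return problems


plan = {"infrastructure": 3, "platform": 2, "application": 1, "clambda": 2}
for problem in check_ha_plan(plan):
    print(problem)
```

With the example plan, the only problem reported is the single application node.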

Infrastructure Services in Cluster Mode:

HAProxy: HAProxy is a software-based load balancer for TCP and HTTP based applications. Within the cfxDimensions platform it provides load balancing and high availability for infrastructure, platform and application services. The primary functions of HAProxy within the cfxDimensions platform are listed below.

  • Front-end UI portal access, alerts/events & API access over HTTP(s) with SSL

  • MariaDB access to platform and application services when MariaDB is deployed in cluster mode

  • Minio object storage access to platform and application services

  • Service registry access among application services

The HAProxy service is containerized, configured specifically for compatibility with the cfxDimensions platform, and deployed in Active/Standby mode on 2 of the 3 infrastructure service nodes. HAProxy's front-end and back-end cluster IPs are virtual IPs that are configured and managed through the keepalived service.

Keepalived: Keepalived is a standard Linux service that is included as part of the cfxDimensions platform. Keepalived uses the IP Virtual Server (IPVS) kernel module to provide transport layer (Layer 4) load balancing, redirecting requests for network-based services to individual members of a server cluster. It monitors the status of each server and uses the Virtual Router Redundancy Protocol (VRRP) to implement high availability.

The Keepalived service is configured as a Linux service on both the Active and Standby nodes of the HAProxy service.

Its primary function is to manage HAProxy's front-end & back-end virtual IPs, providing high availability at the network layer. When the active HAProxy node (or service) goes down, keepalived detects the failure and automatically transfers the HAProxy cluster virtual IP to the standby HAProxy node. Application traffic (internal or external) is then re-routed and processed through the standby HAProxy node seamlessly.
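The failover behaviour described above can be illustrated with a toy model of VRRP-style election, in which the healthy node with the highest priority holds the virtual IP. This is a simplified sketch and not keepalived's implementation: the node names and the priority values 101 and 100 are assumptions made for the example, and real VRRP also involves advertisement timers and tie-breaking that are ignored here.

```python
from dataclasses import dataclass


@dataclass
class VrrpNode:
    name: str
    priority: int         # higher priority wins the election
    healthy: bool = True   # False when the node or its HAProxy service is down


def vip_owner(nodes):
    """Return the name of the node that should hold the virtual IP, or None."""
    candidates = [n for n in nodes if n.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n.priority).name


# Hypothetical Active/Standby pair.
nodes = [VrrpNode("haproxy-active", 101), VrrpNode("haproxy-standby", 100)]
print(vip_owner(nodes))   # haproxy-active

# Simulate a failure of the active node: the VIP moves to the standby.
nodes[0].healthy = False
print(vip_owner(nodes))   # haproxy-standby
```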

Kafka & Zookeeper: Kafka is used as a message queue to publish (write) and subscribe to (read) streams of events among application services within the cfxDimensions platform. Kafka natively supports distributed deployment, which allows it to scale and provide high availability. The Zookeeper service is used along with Kafka and is also deployed in distributed mode to provide high availability and scale.

The Kafka & Zookeeper services are containerized and configured specifically for compatibility with the cfxDimensions platform. When deployed in a 3-node configuration with the default replication settings (replication factor set to 2), they tolerate the failure of 1 node.
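The one-node failure tolerance stated above follows from simple arithmetic, sketched below. The functions are illustrative only. The Zookeeper part assumes the usual majority-quorum rule, and the Kafka part assumes `min.insync.replicas` is 1, which this document does not state.

```python
def zookeeper_tolerance(ensemble_size: int) -> int:
    """Zookeeper stays available while a strict majority of the ensemble is up."""
    quorum = ensemble_size // 2 + 1
    return ensemble_size - quorum


def kafka_tolerance(replication_factor: int, min_insync_replicas: int = 1) -> int:
    """Brokers a partition can lose while still accepting acknowledged writes.

    min_insync_replicas=1 is an assumption, not a documented cfxDimensions setting.
    """
    return max(replication_factor - min_insync_replicas, 0)


nodes, replication_factor = 3, 2
print(zookeeper_tolerance(nodes))             # 1
print(kafka_tolerance(replication_factor))    # 1
# The cluster as a whole tolerates the smaller of the two values.
print(min(zookeeper_tolerance(nodes), kafka_tolerance(replication_factor)))  # 1
```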

Kafka & Zookeeper disk mount points on each cluster node:

  • Data mount point: (Kafka & Zookeeper)

    • /kafka-logs1

    • /kafka-logs2

    • /Zookeeper

  • Service logs path: (Kafka & Zookeeper)

    • /opt/macaw/log/kafka/<node-ip>

    • /opt/macaw/log/zookeeper/<node-ip>

Minio Object Storage: MinIO is a high-performance object storage service that is API compatible with the Amazon S3 cloud storage service and can be deployed in distributed mode for scale and high availability. It is primarily used to store and query configuration, ML experiment data, pipelines, alert bundles, inventory and analytical data files, etc. The Minio service is containerized and configured specifically for compatibility with the cfxDimensions platform. When deployed in a 3-node cluster, it is configured with 12 disk mount points in total (4 disk mount points per node), with the Minio rrs storage class/policy set to EC:4 (i.e. 8 data disks and 4 parity disks).
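The disk arithmetic behind the EC:4 setting can be sketched as follows. This is a back-of-the-envelope model, not MinIO's code: it assumes all 12 disks form a single erasure set and considers only whether objects remain readable. The function and key names are hypothetical.

```python
def erasure_layout(total_drives: int, parity: int, drives_per_node: int) -> dict:
    """Summarize an erasure-coded layout under the single-erasure-set assumption."""
    data = total_drives - parity
    return {
        "data_drives": data,
        "parity_drives": parity,
        # Fraction of raw capacity available for object data.
        "usable_fraction": data / total_drives,
        # Objects stay readable as long as at least `data` drives remain.
        "drive_failures_tolerated": parity,
        # Losing a node takes all of its drives offline at once.
        "node_failures_tolerated": parity // drives_per_node,
    }


layout = erasure_layout(total_drives=12, parity=4, drives_per_node=4)
print(layout["data_drives"], layout["parity_drives"])   # 8 4
print(layout["node_failures_tolerated"])                # 1
```

Under these assumptions, two thirds of the raw capacity is usable and the loss of one full node (4 disks) stays within the parity budget.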

Minio object storage disk mount points on each cluster node:

  • Data mount point: (Minio object storage)

    • /minio-data01

    • /minio-data02

    • /minio-data03

    • /minio-data04

  • Logs:

    • Minio container logs

docker ps | grep -i minio
docker logs <minio-container-id>

Gluster shared filesystem: GlusterFS is a scalable, distributed network filesystem. The cfxDimensions platform uses it as a shared filesystem on which application and infrastructure services store service logs, certificates and configuration. The Gluster service is containerized and configured specifically for compatibility with the cfxDimensions platform.

Gluster is deployed in a 3-node configuration, similar to the other infrastructure services, as 2 data replication nodes and 1 arbiter node. Each data replication node contains a data brick volume that is used for data replication between the Gluster cluster nodes. Only the two data brick (volume) nodes hold replicated data, while the third node acts as an arbiter, which prevents split-brain and provides the same data consistency guarantees as a normal replica 3 volume without consuming the disk space of a third copy.
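The trade-off described above can be sketched in a few lines. This is a simplified model under stated assumptions: the brick names are hypothetical, the arbiter's metadata is treated as negligible, and the additional checks Gluster applies before allowing writes (for example, whether the surviving data brick holds a good copy) are ignored.

```python
BRICKS = {"data1", "data2", "arbiter"}  # hypothetical brick names


def volume_writable(bricks_up: set) -> bool:
    """Simplified client quorum: at least 2 of the 3 bricks must be reachable."""
    return len(bricks_up & BRICKS) >= 2


def raw_disk_needed(data_gib: int, layout: str) -> int:
    """Raw disk consumed by full data copies; arbiter metadata is ignored."""
    copies = {"replica3": 3, "replica2+arbiter": 2}[layout]
    return data_gib * copies


print(volume_writable({"data1", "arbiter"}))       # True: one data node lost
print(volume_writable({"data2"}))                  # False: quorum lost
print(raw_disk_needed(100, "replica2+arbiter"))    # 200
print(raw_disk_needed(100, "replica3"))            # 300
```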

The configured Gluster volume name is 'macaw' and it is mounted on all of the cfxDimensions platform VMs as /opt/macaw (VMs: Platform, Infrastructure (DB/Data), Application services & cLambda)

GlusterFS shared filesystem mount point on each cluster node:

  • Data mount point: (Gluster data brick)

    • /glusterfs01

  • Logs:

    • Gluster container logs

docker ps | grep -i gluster
docker logs <gluster-container-id>

MariaDB Database: MariaDB is a relational database service that stores user configuration, platform & application configuration, and alert and incident data for the cfxDimensions platform and its application services. MariaDB supports high availability natively: it can be deployed in a Master/Slave configuration or, using the Galera clustering feature, in a Master/Master configuration. Within the cfxDimensions platform, MariaDB is deployed in Master/Master (Galera cluster) configuration. The MariaDB service is containerized and configured specifically for compatibility with the cfxDimensions platform and its application services.

MariaDB database mount point on each cluster node:

  • Data mount point:

    • /var/mysql

  • DB service logs path:

    • /opt/macaw/shared/log/mariadb/<node-ip>/mariadb.log

For AIOps application services, please refer to the section cfxOIA Application Services.

For detailed general documentation, please refer to:

  • About Kafka & Zookeeper

  • About Minio Object Storage

  • About GlusterFS

  • About MariaDB and About Galera Cluster