Install ClickHouse on Ubuntu 20.04 - Step by step process to perform it








ClickHouse is a column-oriented database designed for OLAP (Online Analytical Processing).

OLAP is an approach to running complex analytical queries over large volumes of data.

The language ClickHouse uses is a dialect of SQL, which makes it easier for beginners to pick up.

Installing ClickHouse on Ubuntu involves a series of steps, including adjusting the configuration file so that the server listens on other IP addresses and accepts remote access.

Here at Ibmi Media, as part of our Server Management Services, we regularly help our customers perform Ubuntu-related software installation tasks.

In this context, we shall look into the steps to install ClickHouse on Ubuntu.


How to install ClickHouse on Ubuntu?

ClickHouse is an open-source analytics database developed for big data use cases.

Typically, the installation process involves the steps below:

i. Installing ClickHouse

ii. Starting the Service

iii. Change the listening IP address

iv. Enable Tabix

v. Setting Up Firewall Rules


Next, let us discuss each of them in detail.


1. Installing ClickHouse

In this section, we will install the ClickHouse server and client programs using apt.

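First, we need to add the ClickHouse repository. Note that repo.yandex.ru is the legacy ClickHouse repository; newer releases are published at packages.clickhouse.com.

To import the repository's GPG key and then add the repository to the APT sources, use the commands below: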

$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4
$ echo "deb http://repo.yandex.ru/clickhouse/deb/stable/ main/" | sudo tee /etc/apt/sources.list.d/clickhouse.list

Now, we shall update the packages:

$ sudo apt update

Next, we shall proceed to install the clickhouse-server and clickhouse-client packages:

$ sudo apt install clickhouse-server clickhouse-client

During installation, we will be prompted to set a password for the default ClickHouse user.

We will need this password to access the ClickHouse server later.
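
Before moving on, it can help to confirm that both programs actually ended up on the PATH. The following is a minimal sketch rather than anything shipped with the packages; the helper name check_installed is made up for illustration.

```shell
#!/bin/sh
# check_installed: hypothetical helper that reports which of the given
# commands are missing from PATH. Returns 0 only if all are present.
check_installed() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found: $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (run after the apt install step):
#   check_installed clickhouse-server clickhouse-client
```

A non-zero exit status means at least one program was not found, which usually points to a failed or interrupted apt run.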


2. Starting the Service

After the ClickHouse server and client installation, let us proceed to start the database service.

i. We can start the clickhouse-server service and verify its status with the commands below:

$ sudo service clickhouse-server start
$ sudo service clickhouse-server status

The output of the status command confirms that the server is running. 

ii. Further, we can also enable ClickHouse to run on boot with the command below:

$ sudo systemctl enable clickhouse-server

iii. Now that we have enabled ClickHouse, we can access ClickHouse with the password that we set during installation.

$ clickhouse-client --password

This will take us to the prompt, where we can execute SQL statements to create, update, and delete databases, tables, and so on.

We can also execute queries to retrieve filtered data from this client prompt.

For instance, to create a test database use the command below:

ch:) CREATE DATABASE test;
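
For scripted setups, the same kind of statement can be sent without opening the interactive prompt. This is a minimal sketch that relies on the --query option of clickhouse-client; the names run_sql and CH_PASSWORD are hypothetical and chosen only for this example.

```shell
#!/bin/sh
# run_sql: hypothetical wrapper that sends one SQL statement to the local
# server through clickhouse-client's --query option.
# CH_PASSWORD is an assumed variable name holding the default user's password.
run_sql() {
    clickhouse-client --user default --password "${CH_PASSWORD:-}" --query "$1"
}

# Usage:
#   CH_PASSWORD='your_password' run_sql "CREATE DATABASE IF NOT EXISTS test"
#   CH_PASSWORD='your_password' run_sql "SHOW DATABASES"
```

Using IF NOT EXISTS keeps the statement safe to run more than once, which matters when the script is re-run.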


3. Change the listening IP address

By default, ClickHouse listens only on localhost.

Thus, we can access it only from the same server. However, we can change the listening IP address by editing the /etc/clickhouse-server/config.xml file.

Open this file with any available text editor, find the string listen_host, and uncomment the line.

To make the ClickHouse server listen only on a specific IP address, we can edit the line as follows:

<listen_host>xxx.xxx.xxx.xxx</listen_host>

Replace xxx.xxx.xxx.xxx with the actual IP address.
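
As an alternative to editing config.xml in place, ClickHouse also reads override files from /etc/clickhouse-server/config.d/. The sketch below only prints such an override so it can be reviewed before it is written anywhere. The function name render_listen_override and the file name in the usage comment are hypothetical, and the <yandex> root tag is an assumption that matches older releases (newer releases use <clickhouse> as the root tag).

```shell
#!/bin/sh
# render_listen_override: hypothetical helper that prints a config.d override
# setting listen_host to the address given as the first argument.
# The <yandex> root tag is an assumption; adjust it to match your version.
render_listen_override() {
    printf '<yandex>\n    <listen_host>%s</listen_host>\n</yandex>\n' "$1"
}

# Usage (listen.xml is an arbitrary file name):
#   render_listen_override 203.0.113.10 | sudo tee /etc/clickhouse-server/config.d/listen.xml
#   sudo systemctl restart clickhouse-server
```

Keeping the change in a separate file leaves the packaged config.xml untouched, which tends to make package upgrades less painful.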


4. Enable Tabix

To access the ClickHouse server through a web browser, we need to enable Tabix by editing the config.xml file as mentioned before. 

Open the config.xml file as we did earlier, find the string http_server_default_response, and uncomment the line.

After making changes to the config.xml file, we need to restart our clickhouse-server service:

$ sudo systemctl restart clickhouse-server

Now, we can access ClickHouse in a browser at http://Your_IP_Address:8123 and log in with the default user and the password that we set during installation.
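
A quick way to confirm that the HTTP interface is reachable before opening a browser is to request the server's /ping path, which answers with "Ok." when the server is up. This is a minimal sketch that assumes curl is installed; ch_ping is a hypothetical helper name.

```shell
#!/bin/sh
# ch_ping: hypothetical helper. Requests http://HOST:PORT/ping and succeeds
# only if the body is exactly "Ok.", the reply ClickHouse sends on that path.
ch_ping() {
    host=$1
    port=${2:-8123}
    reply=$(curl -fsS --max-time 5 "http://${host}:${port}/ping") || return 1
    [ "$reply" = "Ok." ]
}

# Usage:
#   ch_ping Your_IP_Address && echo "HTTP interface is up"
```

If the check fails from a remote machine but works on the server itself, the listening address or the firewall is the likely cause.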


5. Setting Up Firewall Rules

Follow this step if we intend to connect to the ClickHouse database server remotely.

A remote connection requires ClickHouse to listen on an IP address other than localhost.

Thus, changing the listening IP address, as discussed earlier, is the first step toward enabling remote access.

The ClickHouse server listens on port 8123 for HTTP connections and on port 9000 for connections from clickhouse-client.

Allow access to both ports for the remote server IP address with the following commands:

$ sudo ufw allow from remote_server_ip/32 to any port 8123
$ sudo ufw allow from remote_server_ip/32 to any port 9000

We can add additional IPs such as our local machine’s address in the same manner if required.
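
When several client machines need access, the two rules can be generated per address. The sketch below prints the ufw commands instead of running them, so they can be reviewed first. The name print_ufw_rules is hypothetical, and the addresses in the usage lines are documentation placeholders.

```shell
#!/bin/sh
# print_ufw_rules: hypothetical helper. For every IP given, prints the ufw
# commands that open the HTTP port (8123) and the native client port (9000).
print_ufw_rules() {
    for ip in "$@"; do
        for port in 8123 9000; do
            echo "sudo ufw allow from ${ip}/32 to any port ${port}"
        done
    done
}

# Usage: review the output, then pipe it to sh to apply the rules.
#   print_ufw_rules 198.51.100.7 203.0.113.25
#   print_ufw_rules 198.51.100.7 203.0.113.25 | sh
```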

Now, to verify the remote connection, install clickhouse-client on the remote server by following the installation steps above. Then, start a client session by executing:

$ clickhouse-client --host your_server_ip --password

The output will confirm that we have connected to the server.


How to start a ClickHouse session?

To start working with ClickHouse databases, launch the ClickHouse client. 

Starting a session works much as it does in other SQL database systems.

To start the client, use the command:

$ clickhouse-client

You may get this error:

“Code: 516. DB::Exception: Received from localhost:9000. DB::Exception: default: Authentication failed: password is incorrect or there is no user with such name.”

When that error occurs, you need to supply the password that was set for the default user during installation.

To do so, enter:

$ clickhouse-client --password test1234 --user default

Replace the sample password with your own.

The session starts.





Conclusion

This article covers how to install ClickHouse on Ubuntu. ClickHouse is an open-source analytics database developed for big data use cases.

Installing ClickHouse on Ubuntu involves a series of steps, including adjusting the configuration file so that the server listens on other IP addresses and accepts remote access.


Column-oriented databases store records in blocks grouped by columns instead of rows. 

By not loading data for columns absent in the query, column-oriented databases spend less time reading data while completing queries. 

As a result, these databases can compute and return results much faster than traditional row-based systems for certain workloads, such as OLAP.


Online Analytical Processing (OLAP) systems allow for organizing large amounts of data and performing complex queries.

They are capable of managing petabytes of data and returning query results quickly. 

In this way, OLAP is useful for work in areas like data science and business analytics.


Aggregation queries are queries that operate on a set of values and return a single output value.

In analytics databases, these queries are run frequently and are well optimized by the database. 


Some aggregate functions supported by ClickHouse are:

1. count: returns the count of rows matching the conditions specified.

2. sum: returns the sum of selected column values.

3. avg: returns the average of selected column values.


Some ClickHouse-specific aggregate functions include:

1. uniq: returns an approximate number of distinct rows matched.

2. topK: returns an array of the most frequent values of a specific column using an approximation algorithm.
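
To see these functions side by side, a single query over the built-in numbers table function is enough, so no table has to be created first. This is a minimal sketch: aggregate_demo_sql is a hypothetical helper that only prints the statement, and the usage line assumes the default user's password is held in a variable named CH_PASSWORD.

```shell
#!/bin/sh
# aggregate_demo_sql: hypothetical helper that prints one SELECT exercising
# count, sum, avg, uniq and topK over the integers 0..9 from numbers(10).
aggregate_demo_sql() {
    printf '%s\n' \
        'SELECT' \
        '    count()               AS row_count,' \
        '    sum(number)           AS total,' \
        '    avg(number)           AS mean,' \
        '    uniq(number % 3)      AS distinct_remainders,' \
        '    topK(2)(number % 3)   AS frequent_remainders' \
        'FROM numbers(10)'
}

# Usage (CH_PASSWORD is an assumed variable name):
#   clickhouse-client --password "$CH_PASSWORD" --query "$(aggregate_demo_sql)"
```

Note the two pairs of parentheses in topK: the first holds the number of values to return, the second the expression to count.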


You can set up a ClickHouse database instance on your server and create a database and table, add data, perform queries, and delete the database.

You can start, stop, and check the ClickHouse service with a few commands.

To start the clickhouse-server, use:

$ sudo systemctl start clickhouse-server

The command prints no confirmation when it succeeds.

To check the ClickHouse service status, enter:

$ sudo systemctl status clickhouse-server

To stop the ClickHouse server, run this command:

$ sudo systemctl stop clickhouse-server

To enable ClickHouse on boot:

$ sudo systemctl enable clickhouse-server
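
These checks can also be scripted. The following is a minimal sketch built on systemctl's is-active and is-enabled subcommands; ch_service_report is a hypothetical helper name.

```shell
#!/bin/sh
# ch_service_report: hypothetical helper that prints whether the
# clickhouse-server unit is running and whether it starts on boot.
ch_service_report() {
    unit=clickhouse-server
    if systemctl is-active --quiet "$unit"; then
        echo "$unit: running"
    else
        echo "$unit: not running"
    fi
    if systemctl is-enabled --quiet "$unit"; then
        echo "$unit: enabled on boot"
    else
        echo "$unit: not enabled on boot"
    fi
}

# Usage:
#   ch_service_report
```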


