Installing GoAccess on Ubuntu 20.04 - Steps to do it?









Are you trying to install GoAccess on Ubuntu?

This guide is for you.

GoAccess is a free, open-source, real-time web server log analyzer that we can use to analyze and view web server logs.
Here at Ibmi Media, as part of our Server Management Services, we regularly help our customers perform software installation tasks.
In this context, we shall look into the steps to install GoAccess on an Ubuntu 20.04 web server.

More information about GoAccess on Ubuntu

GoAccess is a tool for monitoring web server logs in real time. In this article, let's see how to install it on an Ubuntu 20.04 web server.
We will access the Apache log files with GoAccess before reviewing the modules available and navigation shortcuts on the command-line interface.
To follow along, our Support Techs suggest having:
i. One Ubuntu 20.04 server with a non-root user with sudo privileges and a firewall.
ii. Apache

To install the Apache web server, we use the command:

$ sudo apt-get -y install apache2

Then, we start and enable the webserver to run at boot time.

$ sudo systemctl start apache2
$ sudo systemctl enable apache2

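Next, allow the HTTP port through the system firewall. The commands below assume firewalld is the active firewall; on a server that uses ufw instead, sudo ufw allow http does the same job.

$ sudo firewall-cmd --add-service=http --permanent
$ sudo firewall-cmd --reload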

How to Install GoAccess on Ubuntu?

Here, we will use the following steps to install GoAccess on Ubuntu.

Step 1 – Install GoAccess

In this step, let's see how our Support Engineers install the GoAccess tool and its dependencies.
Initially, we ensure that the package database and system are up to date:

$ sudo apt update
$ sudo apt full-upgrade

GoAccess is available in the Ubuntu repositories, but that package is usually not the latest stable version.
To get the latest stable version on the server, we can compile it from source or use the official GoAccess repository.

Method 1 – Compile from source
First, we install the dependencies to compile GoAccess from the source:

$ sudo apt install libncursesw5-dev libgeoip-dev libtokyocabinet-dev build-essential

We install the following dependencies:
i. build-essential: installs many packages, including the gcc compilers for C and C++, and make, which builds GoAccess from its makefile.
ii. libncursesw5-dev: installs the ncurses library that GoAccess uses for its command-line interface.
iii. libgeoip-dev: includes the necessary files for the GeoIP library.
iv. libtokyocabinet-dev: provides database dependencies for higher performance.

Next, we download the GoAccess source archive from the website (version 1.4 in this guide):

$ wget http://tar.goaccess.io/goaccess-1.4.tar.gz

Once the download completes, we extract the archive with:

$ tar -xzvf goaccess-1.4.tar.gz

Then change into the newly unpacked directory:

$ cd goaccess-1.4/

We run the configure script found inside this directory:

$ ./configure --enable-utf8 --enable-geoip=legacy

The --enable-utf8 flag ensures GoAccess compiles with wide character support, while --enable-geoip enables GeoLocation support with the original GeoIP databases.
We can replace legacy with mmdb to use the enhanced GeoIP2 databases instead.

Our output will be similar to the following:

Output
. . .
Your build configuration:

Prefix : /usr/local
Package : goaccess
Version : 1.4
Compiler flags : -pthread
Linker flags : -lnsl -lncursesw -lGeoIP -lpthread
UTF-8 support : yes
Dynamic buffer : no
Geolocation : GeoIP Legacy
Storage method : In-Memory with On-Disk Persitance Storage
TLS/SSL : no
Bugs : hello@goaccess.io

Then, we run the make command to compile GoAccess using the generated makefile:

$ make

Finally, we install the compiled GoAccess binary to the system:

$ sudo make install

We confirm that the installation was successful by checking the version:

$ goaccess --version

We will receive the following output:
Output

GoAccess - 1.4.
For more details visit: http://goaccess.io
Copyright (C) 2009-2020 by Gerardo Orellana

Build configure arguments:

--enable-utf8
--enable-geoip=legacy

Method 2 – Use the Official GoAccess Repos
This method is preferable if we want GoAccess to update to a newer version automatically during system upgrades, so that we don't have to compile from source for each new release.
We need to add the repository to the server first:

$ echo "deb http://deb.goaccess.io/ $(lsb_release -cs) main" | sudo tee -a /etc/apt/sources.list.d/goaccess.list

The command substitutes the release codename of the distribution into the repository line and pipes the result to tee, which appends it to the file /etc/apt/sources.list.d/goaccess.list.
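To make the effect of that pipeline concrete, here is a minimal sketch that only prints the repository line instead of writing it. The function name goaccess_repo_line is hypothetical, and focal is the codename that lsb_release -cs reports on Ubuntu 20.04.

```shell
# Hypothetical helper: print the APT source line for a given Ubuntu codename.
goaccess_repo_line() {
  codename="$1"
  printf 'deb http://deb.goaccess.io/ %s main\n' "$codename"
}

# On Ubuntu 20.04, lsb_release -cs prints "focal".
goaccess_repo_line focal
```

Printing the line first is a cheap way to check it before it ends up in the sources list.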

With the repository in our sources list, we can now download the GPG key to verify the signature:

$ wget -O - https://deb.goaccess.io/gnugpg.key | sudo apt-key --keyring /etc/apt/trusted.gpg.d/goaccess.gpg add -

Next, we update the package database with the following command:

$ sudo apt update

Finally, install GoAccess:

$ sudo apt install goaccess

Now we can edit its configuration file to change how the program runs.

Step 2 – Edit the GoAccess Configuration

GoAccess comes with a configuration file where we can make permanent changes to the behavior of the program. We edit this file to specify the time, date, and log format so that GoAccess knows how to parse the server logs.
The configuration file will be at ~/.goaccessrc or %sysconfdir%/goaccess.conf where %sysconfdir% is either /etc/, /usr/etc/, or /usr/local/etc/.
To find the location of the config file on the server, we run:

$ goaccess --dcf

Sample output

/etc/goaccess/goaccess.conf
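If GoAccess is not on the PATH yet, a rough alternative is to probe the locations mentioned above by hand. This is only a sketch: first_existing_conf is a hypothetical helper, and the order of the candidate paths is an assumption, not GoAccess's own lookup order.

```shell
# Hypothetical helper: print the first readable file among the given paths.
first_existing_conf() {
  for candidate in "$@"; do
    if [ -r "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Candidate locations from this guide, in an assumed order of preference.
first_existing_conf \
  "$HOME/.goaccessrc" \
  /etc/goaccess/goaccess.conf \
  /usr/etc/goaccess.conf \
  /usr/local/etc/goaccess.conf \
  || echo "no GoAccess config file found in the usual places"
```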

We edit this config file using nano:

$ sudo nano /etc/goaccess/goaccess.conf


If this file does not exist on the server, we create it first and populate it with the contents of the goaccess.conf file from the GoAccess repository on GitHub.

Let us uncomment the time-format setting for Apache first. This setting tells GoAccess how the time is written in the log, so that it can parse any plain-text Apache log file that uses a supported format.
# The following time format works with any of the
# Apache/NGINX’s log formats below.
#

time-format %H:%M:%S

Next, we uncomment the Apache date-format setting that specifies the log-format date:
# The following date format works with any of the
# Apache/NGINX’s log formats below.
#

date-format %d/%b/%Y

Finally, we uncomment the log-format setting.
Several commented lines define this setting; the one to uncomment depends on how the web server is set up.
If we do not use virtual hosts, uncomment the following log-format line:
# NCSA Combined Log Format

log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"

Otherwise, if we have virtual hosts set up, uncomment the following line instead:
# NCSA Combined Log Format with Virtual Host

log-format %v:%^ %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
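To see how these specifiers line up with a real entry, the sketch below takes one made-up access log line and labels the parts that correspond to %h, %d, %t, %r and %s. It uses awk rather than GoAccess itself, and both the sample line and the explain_combined_line helper are hypothetical.

```shell
# Hypothetical access.log entry in NCSA Combined Log Format.
line='203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "http://example.com/" "Mozilla/5.0"'

# Hypothetical helper: label the parts GoAccess would read as %h, %d, %t, %r and %s.
explain_combined_line() {
  printf '%s\n' "$1" | awk -F'"' '{
    split($1, head, " ")          # head[1] is the client, head[4] is "[date:time"
    stamp = substr(head[4], 2)    # drop the leading "["
    colon = index(stamp, ":")     # the first colon separates date from time
    split($3, tail, " ")          # tail[1] is the status code
    print "%h=" head[1]
    print "%d=" substr(stamp, 1, colon - 1)
    print "%t=" substr(stamp, colon + 1)
    print "%r=" $2
    print "%s=" tail[1]
  }'
}

explain_combined_line "$line"
```

The %d and %t values it prints have the shape that the date-format %d/%b/%Y and time-format %H:%M:%S settings describe.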

At this point, we can save the file and exit the editor. We are now ready to run the GoAccess program and analyze some Apache plain-text log files.

Step 3 – Access Apache’s Log Files with GoAccess

The Apache server grants access to the website and keeps an access log for all HTTP traffic. These records or log files are a valuable source of information about the website's usage and audience.
On Ubuntu, the Apache log files are in the /var/log/apache2 directory by default.
To inspect the contents of this directory, we run:

$ sudo ls /var/log/apache2

Sample output

access.log error.log other_vhosts_access.log

If the server has been running for a long time, we may find compressed .gz files in this directory containing past log files as a result of log rotation.
Then we run GoAccess against the Apache access logs to gain insight into the type of traffic handled by the web server.
We run the following command to analyze the access.log file with GoAccess:

$ sudo goaccess /var/log/apache2/access.log

This will launch the GoAccess command-line dashboard.
If we see a Log Format Configuration prompt instead, it means that the changes in the GoAccess config file are not taking effect.
In that case, we check that the config file is in the right place and that we have uncommented the necessary settings.
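One quick way to check is to list the format settings that are actually active in the file. This is a minimal sketch; active_format_settings is a hypothetical helper, and the path is the one from the sample output of Step 2.

```shell
# Hypothetical helper: list the format settings that are active (not commented out).
active_format_settings() {
  grep -E '^(time-format|date-format|log-format)[[:space:]]' "$1"
}

conf=/etc/goaccess/goaccess.conf   # path assumed from the Step 2 sample output
if [ -r "$conf" ]; then
  active_format_settings "$conf" || echo "no format settings are active in $conf"
fi
```

All three settings should show up in the output; a missing one usually means its line still starts with #.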
Sometimes we have several compressed log files on a long-running web server. To run GoAccess on all these files without extracting them first, we can pipe the output of the zcat command into GoAccess:

$ zcat /var/log/apache2/access.log.*.gz | goaccess -a
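The command above reads only the compressed files. As a sketch of one way to include the current, uncompressed access.log as well, zcat -f passes plain files through unchanged. The all_access_logs helper is hypothetical.

```shell
# Hypothetical helper: print every access log in a directory, compressed or not.
# zcat -f passes uncompressed files through unchanged.
all_access_logs() {
  zcat -f "$1"/access.log*
}

# Assumed usage (needs read permission on the log directory):
# all_access_logs /var/log/apache2 | goaccess -a
```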

Now, let us check how to quickly navigate through the dashboard interface with keyboard shortcuts.
 

Step 4 – Navigate the Terminal Dashboard

Generally, at the top of the dashboard is a summary of several key metrics. This includes total requests for the reporting period, unique visitors, log size, 404 not found errors, and more.
Below the top panel, we will find all the available modules which provide more details on the aforementioned metrics and other data points supported by GoAccess.
To navigate the interface, we use the following keyboard shortcuts:
TAB to move forward through the available modules and SHIFT+TAB to move backward.
F5 to refresh the dashboard.
g to move to the top of the dashboard screen and G to move to the last item in the dashboard.
o or ENTER to expand the selected module.
j and k to scroll down and up within the active module.
s to display the sort options for the active module.
/ to search across all modules and n to move to the next match.
0-9 and SHIFT+0 to quickly activate the respective numbered module.
? to view the quick help dialog.
q to quit the program.

Moving ahead, let us examine each of the available modules on the dashboard.
Each one has a number and a title, and an indication of the total number of lines present. The > character indicates the active panel.

Here is a brief explanation by our Support Techs about each of the panels. Each section below corresponds to the panel number and title in the program.

1 – Unique Visitors per Day
This panel displays the hits, unique visitors, and cumulative bandwidth for each reported date. Requests that share the same IP address, date, and user-agent count as one unique visitor. Web crawlers and spiders are included by default.
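To illustrate that definition, here is a rough awk sketch that counts distinct combinations of IP address, date, and user-agent in a few made-up log lines. It is not how GoAccess computes the figure internally; sample_log and count_unique_visitors are hypothetical names.

```shell
# Hypothetical sample in Apache combined log format.
sample_log() {
cat <<'EOF'
203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
203.0.113.5 - - [10/Oct/2020:14:01:02 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
203.0.113.5 - - [10/Oct/2020:14:05:00 +0000] "GET / HTTP/1.1" 200 2326 "-" "curl/7.68.0"
198.51.100.7 - - [11/Oct/2020:09:00:00 +0000] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
EOF
}

# Count distinct (IP, date, user-agent) combinations read from standard input.
count_unique_visitors() {
  awk -F'"' '{
    split($1, head, " ")          # head[1] is the IP, head[4] is "[date:time"
    split(head[4], stamp, ":")    # stamp[1] is "[date"
    key = head[1] SUBSEP stamp[1] SUBSEP $6   # $6 is the user-agent
    if (!(key in seen)) { seen[key] = 1; n++ }
  } END { print n + 0 }'
}

sample_log | count_unique_visitors
```

The first two lines share IP, date, and user-agent, so they count once; the curl request from the same IP counts separately.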

2 – Requested Files (URLs)
It provides the statistics concerning the most highly requested non-static files on the webserver.

3 – Static Requests
Similarly, this panel provides the same metrics as the previous one, but for static files such as images, CSS, JavaScript, or other file types.

4 – Not Found URLs (404s)
This panel also displays the same metrics, but for paths that were not found on the server (404s).

5 – Visitor Hostnames and IPs
This panel provides detailed information on the hosts that connect to the webserver. We can find their IP address, the number of visits, and the amount of bandwidth consumed. This is a great way to identify who is eating up all the bandwidth and block them if necessary.
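For a quick look at the same kind of information outside GoAccess, the sketch below sums the response sizes per client IP from a few made-up log lines and sorts the result. Both sample_log and bandwidth_by_host are hypothetical names.

```shell
# Hypothetical sample in Apache combined log format.
sample_log() {
cat <<'EOF'
203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
203.0.113.5 - - [10/Oct/2020:14:01:02 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
198.51.100.7 - - [11/Oct/2020:09:00:00 +0000] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
EOF
}

# Sum the bytes sent to each client and list the heaviest consumers first.
bandwidth_by_host() {
  awk -F'"' '{
    split($1, head, " ")   # head[1] is the client IP
    split($3, tail, " ")   # tail[2] is the response size in bytes ("-" counts as 0)
    bytes[head[1]] += tail[2]
  } END {
    for (host in bytes) print bytes[host], host
  }' | sort -rn
}

sample_log | bandwidth_by_host
```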

6 – Operating Systems
This panel reports the different operating systems used by the hosts to connect to the webserver. Expanding this panel will display specific versions of each operating system.

7 – Browsers
Similar to the previous panel, this reports the browsers used by each unique visitor to the web server and lists specific versions for each browser once expanded.

8 – Time distribution
Here, we will find an hourly report for the number of hits, unique visitors, and bandwidth consumed. This is a great way to spot periods of peak traffic on the server.
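The hourly grouping can be approximated by hand as well. This sketch counts hits per hour of day from a few made-up log lines; sample_log and hits_per_hour are hypothetical names, and the hour is taken as written in the log, without any time zone conversion.

```shell
# Hypothetical sample in Apache combined log format.
sample_log() {
cat <<'EOF'
203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
203.0.113.5 - - [10/Oct/2020:14:01:02 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
203.0.113.5 - - [10/Oct/2020:14:05:00 +0000] "GET / HTTP/1.1" 200 2326 "-" "curl/7.68.0"
198.51.100.7 - - [11/Oct/2020:09:00:00 +0000] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
EOF
}

# Count hits for each hour of the day, across all dates in the input.
hits_per_hour() {
  awk '{
    split($4, stamp, ":")   # $4 looks like "[10/Oct/2020:13:55:36"; stamp[2] is the hour
    hits[stamp[2]]++
  } END {
    for (hour in hits) print hour, hits[hour]
  }' | sort
}

sample_log | hits_per_hour
```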

9 – Virtual Hosts
This panel displays the virtual hosts parsed from the log file. It becomes active only if we include %v in the log-format configuration.

10 – Referrer URLs
The URLs that referred the visiting hosts to the web server are reflected here.
This panel is disabled by default. To enable it, we comment out the ignore-panel REFERRERS line in the GoAccess config file /etc/goaccess/goaccess.conf, as in the following excerpt:

#ignore-panel VISIT_TIMES
#ignore-panel VIRTUAL_HOSTS
#ignore-panel REFERRERS
#ignore-panel REFERRING_SITES
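The same change can be previewed with sed instead of an editor. This is a minimal sketch: enable_panel is a hypothetical helper that prints the modified config to standard output and leaves the file itself untouched. The same approach applies to other panels such as KEYPHRASES.

```shell
# Hypothetical helper: print a config with the ignore-panel line for one panel
# commented out, which enables that panel. The file itself is not modified.
enable_panel() {
  panel="$1"
  conf="$2"
  sed "s/^ignore-panel ${panel}[[:space:]]*\$/#&/" "$conf"
}

# Assumed usage: write the result to a new file and review it before replacing the original.
# enable_panel REFERRERS /etc/goaccess/goaccess.conf > goaccess.conf.new
```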

11 – Referring Sites
This panel displays only the host part of each referring site, not the whole URL.

12 – Keyphrases
Here, the keywords used on Google search, Google cache, and Google translate that led to the website are reported.
This panel is also disabled by default and must be enabled in the settings:

#ignore-panel REFERRERS
#ignore-panel REFERRING_SITES
#ignore-panel KEYPHRASES
#ignore-panel STATUS_CODES

13 – HTTP Status Codes
This panel reflects the overall statistics for HTTP status codes returned by the webserver when responding to a request. Expanding the panel will display the aggregated stats for each status code.

14 – Remote User (HTTP Authentication)
This panel displays the user ID of the person requesting a document on the server as determined by HTTP authentication.

15 – Cache status
This panel lets us determine whether a request is being cached and served from the cache. It is enabled if %C is part of the log-format variable, and the status could be MISS, BYPASS, EXPIRED, STALE, UPDATING, REVALIDATED, or HIT.

16 – Geo Location
This panel provides a summary of the geographical locations derived from visiting IP addresses. Expanding this panel will display the aggregated stats for each country of origin.

Step 5 – Generating Reports
Aside from displaying the data in the terminal, GoAccess also allows us to generate HTML, JSON, or CSV reports.
We make sure that we are in the home directory before running any of the commands in this section:

$ cd ~

To output the report as static HTML, specify an HTML file as the argument to the -o flag. This flag also accepts filenames that end in .json or .csv.

$ sudo goaccess /var/log/apache2/access.log -o stats.html

A stats.html file should appear in the user directory.

$ ls

Output

goaccess-1.4 goaccess-1.4.tar.gz snap stats.html

We can copy this file to the user directory on the local machine using SCP.
We run this command from the local machine, and not the remote server:

$ scp user@your_server_ip:stats.html ~/stats.html

Once the file has been copied over, we can open it in the browser with the open command on macOS:

$ open ~/stats.html

Or, if we use a Linux distribution on the local machine:

$ xdg-open ~/stats.html

We have generated an HTML report and viewed this in the browser.
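Since the -o flag also accepts .json and .csv filenames, all three report types can be produced in one pass. The sketch below is one way to do that; generate_reports is a hypothetical helper, and it calls goaccess without sudo, so it assumes the current user can read the log file.

```shell
# Hypothetical helper: write HTML, JSON and CSV reports for one log file.
generate_reports() {
  log="$1"
  outdir="$2"
  for ext in html json csv; do
    goaccess "$log" -o "$outdir/stats.$ext" || return 1
  done
}

# Assumed usage, run as a user who can read the log:
# generate_reports /var/log/apache2/access.log "$HOME"
```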


Conclusion

This article guides you through installing GoAccess, a command-line tool, on Ubuntu and using it to analyze server logs.
With GoAccess, you can SSH into any web server you control and view or analyze relevant statistics quickly and securely. Apart from the command-line dashboard interface, it can also output the statistics in other formats such as HTML, JSON, and CSV, which you can use in other contexts or share with others.



