Tracing and accounting of physical resources in the computer centre
August 2015
Author: Josip Domšić
Supervisor: Ulrich Schwickerath
CERN openlab Summer Student Report 2015
Project Specification
CERN is undertaking a large-scale virtualization of its more than 10,000 servers; all hypervisors
and most virtual machines are centrally managed. However, there are some legitimate use cases
that cannot be covered by this scheme. Service managers can therefore request physical
resources, which are then owned by them. Central management of these resources is not strictly
required. To trace and properly account for these resources, information from different sources
needs to be combined. The aim of the project is to develop the required tools, ensure that they
run regularly, and feed the data into the relevant accounting repositories.
Abstract
Within the project “Tracing and accounting of physical resources in the CERN data centre”, a fully
automated way of collecting information about physical resources has been designed and
implemented. The implementation is split into three phases. In the first phase, the information is
collected from a number of different databases: hardware, network, Elasticsearch, Puppet, and
Foreman. In the second phase, the collected information is stored in a Django database. The third
phase is a web application that aggregates and displays the accumulated data.
Table of Contents
Project Specification
Abstract
1. Introduction
2. Technologies
3. Design
    1. Generate daily report
        Hardware database
        Network database
        Puppet and Foreman
        Generating report
    2. Accumulate daily data
    3. Hosting application data
4. Usage
5. Conclusion
1. Introduction
The IT department is the organization responsible for managing computer resources at CERN, the
European Organization for Nuclear Research. CERN offers computing infrastructure to
scientists for running experiments, number crunching, validating hypotheses, etc. All of these
tasks are performed within the CERN computing infrastructure, which relies on
databases, job schedulers, message queues and virtualization of hardware resources. To perform
those tasks efficiently, the IT department is divided into different groups: databases, network,
cloud, etc.
Computing needs grow over time. Existing equipment is therefore regularly replaced or
renewed, and capacity is regularly increased. The procurement of new resources has roughly
three steps:

Order new hardware

Run tests and rate the new hardware

Assign the new machines to specific tasks and organizational groups
A resource can be assigned to experiments, directly to scientists, or to the IT department. Over
time, a resource can change its owner, tasks and/or configuration. Keeping track of this process
manually is error-prone and can result in sub-optimal resource usage. The goal
of this project is to generate daily reports, automate the statistics for all CERN hardware
resources and help prevent unwanted events.
2. Technologies
The technologies used in the project centre on the Python programming language and the
Django web framework, the Linux cron tool, and different types of relational databases.
Python is an object-oriented scripting language. It offers simple means of dealing with
raw text and JSON documents. With the addition of the Django web framework, with its object
management and native handling of SQL databases, Python becomes a powerful tool for building
web pages.
The Linux cron tool offers a simple infrastructure for running specific periodic tasks at given times.
3. Design
The project follows the idea of microservices. Microservices are an architectural pattern for
developing large applications: a large application is divided into smaller programs called
services. All services are developed, tested and run separately, but they must support some
interface for communication.
The infrastructure for collecting information about hardware resources in the CERN data centre
is divided into 3 parts:

Generate daily report

Accumulate daily data, per owner and current user of machine

Host statistical data on a server
The process is shown in the picture below and is explained in detail in the following chapters.
1. Generate daily report
Hardware database
The generator of daily raw data runs each day and collects data from 5 different databases.
The hardware database supplies the starting point: the serial number. Together with the serial
number, the generator collects the hostname, the number of physical and logical cores and the
HEP spec06 rating. The HEP spec06 rating is a standard way at CERN of comparing machines'
specifications. If something unusual happens while calculating the HEP spec06 rating or
acquiring the number of physical or logical cores, the generator deals with it as follows:

If the number of physical cores is missing, it is set to 0 and a warning is printed:
“[WARNING] Number of physical cores is missing for #Serial Number”.

If the number of logical cores is missing, it is set to the number of physical cores.

If the HEP spec06 rating is missing or set to 0, it is set to the number of logical cores times
number_from_configuration.
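The three fallback rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record keys (`serial_number`, `physical_cores`, `logical_cores`, `hepspec`), the function name and the default value used for number_from_configuration are hypothetical and are not taken from the project's code.

```python
import sys

# Hypothetical stand-in for number_from_configuration; the real value
# would be read from the generator's configuration file.
HEPSPEC_PER_LOGICAL_CORE = 10.0


def apply_fallbacks(record, hepspec_per_core=HEPSPEC_PER_LOGICAL_CORE, log=sys.stderr):
    """Return a copy of a hardware record with missing values filled in."""
    fixed = dict(record)

    # Rule 1: missing physical cores -> 0, plus a warning.
    if fixed.get("physical_cores") is None:
        fixed["physical_cores"] = 0
        print("[WARNING] Number of physical cores is missing for %s"
              % fixed["serial_number"], file=log)

    # Rule 2: missing logical cores -> number of physical cores.
    if fixed.get("logical_cores") is None:
        fixed["logical_cores"] = fixed["physical_cores"]

    # Rule 3: missing or zero rating -> logical cores times the configured factor.
    if not fixed.get("hepspec"):
        fixed["hepspec"] = fixed["logical_cores"] * hepspec_per_core

    return fixed
```

Note that the order matters: a record missing both core counts ends up with 0 logical cores and therefore a rating of 0, which is presumably why the warning is worth printing.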
Network database
The next step is collecting information from the network database. The following data is looked
up by serial number:

Owner of the machine

Current user of the machine
If the current user is missing, it is set to the owner. If a serial number found in the network
database is missing from the current report (i.e. missing from the hardware database), an
appropriate warning is printed:
“[NETWORK] Serial number #serial_number not in hardware database. Device name
#device_name”.
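A minimal sketch of this merge step follows, assuming the current report is a dictionary keyed by serial number and the network database returns rows as dictionaries. The function name and the keys `serial_number`, `device_name`, `owner` and `user` are hypothetical.

```python
def merge_network(report, network_rows):
    """Add owner and current user to each report record; return the warnings.

    report: dict mapping serial number -> record (from the hardware database)
    network_rows: iterable of dicts read from the network database
    """
    warnings = []
    for row in network_rows:
        serial = row["serial_number"]
        if serial not in report:
            # Known to the network database but not to the hardware database.
            warnings.append(
                "[NETWORK] Serial number %s not in hardware database. Device name %s"
                % (serial, row["device_name"])
            )
            continue
        record = report[serial]
        record["owner"] = row["owner"]
        # A missing current user falls back to the owner.
        record["user"] = row.get("user") or row["owner"]
    return warnings
```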
Puppet and Foreman
The next step is collecting the management flags: Puppet and Foreman. All entries from the
Puppet and Foreman databases are compared with the current report. If the hostnames match, the
appropriate flag is set. For example, if a hostname from the network database is present in the
Puppet database, the flag is_puppet is set to true.
In addition to the management flags, the flags “SPARE” and “INCOMING” are generated from
the machines’ host group field.
Furthermore, machines that are present in PuppetDB or ForemanDB but not in HardwareDB
are written to the error file as:
[PUPPET] Device missing #device_name
[FOREMAN] Device missing #device_name
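The hostname matching can be sketched as below. Only the flag name is_puppet appears in the text; the name `is_foreman`, the function name and the record layout are assumptions made for illustration. The SPARE and INCOMING flags are left out because their derivation from the host group field is not specified here.

```python
def set_management_flags(report, puppet_hosts, foreman_hosts):
    """Flag each record as managed by Puppet/Foreman; return error-file lines.

    report: dict mapping serial number -> record with a "hostname" key
    puppet_hosts, foreman_hosts: iterables of hostnames from the two databases
    """
    puppet_hosts = set(puppet_hosts)
    foreman_hosts = set(foreman_hosts)
    known = {rec["hostname"] for rec in report.values()}

    for rec in report.values():
        rec["is_puppet"] = rec["hostname"] in puppet_hosts
        rec["is_foreman"] = rec["hostname"] in foreman_hosts  # assumed flag name

    # Hosts known to Puppet or Foreman but absent from the hardware report.
    errors = ["[PUPPET] Device missing %s" % h for h in sorted(puppet_hosts - known)]
    errors += ["[FOREMAN] Device missing %s" % h for h in sorted(foreman_hosts - known)]
    return errors
```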
Generating report
All the combined data is written to a JSON file. The filename is created from today’s date using
the following pattern: %yyyy-%mm-%dd.json. The corresponding error file has the same name but
a different ending: error. Reports are stored in the directory “/var/reports/hardware_resources”.
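As an illustration of the naming scheme, the following sketch builds both paths for a given day. The function name is hypothetical, and it assumes that the "error" ending is appended as a `.error` extension, which the text does not state explicitly.

```python
import datetime
import posixpath

REPORT_DIR = "/var/reports/hardware_resources"


def report_paths(day, directory=REPORT_DIR):
    """Return (report_path, error_path) for the given date."""
    stem = day.strftime("%Y-%m-%d")  # e.g. 2015-08-21
    report_path = posixpath.join(directory, stem + ".json")
    error_path = posixpath.join(directory, stem + ".error")  # assumed extension
    return report_path, error_path
```

Calling `report_paths(datetime.date.today())` would give the paths for the current day's run.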
From the collected flags, three additional flags are derived for each device:

UNMANAGED flag

STALE flag

NOT_IN_PRODUCTION flag
The data is grouped by owner department, then by owner group, and then by current user, in the
following manner:
owner_department:
    owner_group:
        user_department-user_group: [information about devices]
        ....
    ....
....
Example:
A:
    A1:
        A-A1: ... (INFO)
        D-D9: ...
    A2: ...
B:
    ....
....
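The nesting can be produced with plain dictionaries, as in the sketch below. The record keys and the function name are hypothetical; the innermost key follows the department-group form used in the example (A-A1, D-D9).

```python
def group_devices(records):
    """Nest device records as owner department -> owner group -> current user."""
    tree = {}
    for rec in records:
        # Current user key in the department-group form, e.g. "A-A1".
        user_key = "%s-%s" % (rec["user_department"], rec["user_group"])
        (tree.setdefault(rec["owner_department"], {})
             .setdefault(rec["owner_group"], {})
             .setdefault(user_key, [])
             .append(rec))
    return tree
```

The result contains only dictionaries and lists, so it can be passed directly to `json.dump` when the report is written.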
Suggested ways of reading this raw data are:

cat report.json | python -m json.tool | grep [expression]
cat report.json | jgrep [expression]
2. Accumulate daily data
The actual aggregation of the daily data is already done by the generator described above.
Accumulation here therefore means saving the raw data into the database of the Django web
application (SQLite). The script is called accumulate.py and is run every day at 6 am.
It reads the raw data from the directory named in the configuration file, which is passed as the
first parameter to the script. The second parameter is optional and gives the date (and thus the
filename) of the report to be imported into the database. Example:
accumulate.py configuration.conf --date 2015-08-21
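A command line of this shape could be parsed with `argparse` as sketched below. This is not the script's actual code: the function name is hypothetical, and what happens when `--date` is omitted is an assumption (here the value is simply left as `None` for the caller to handle).

```python
import argparse
import datetime


def parse_args(argv=None):
    """Parse 'accumulate.py configuration.conf [--date YYYY-MM-DD]'."""
    parser = argparse.ArgumentParser(prog="accumulate.py")
    parser.add_argument("configuration", help="path to the configuration file")
    parser.add_argument(
        "--date",
        # Convert the text to a datetime.date; a malformed date is rejected.
        type=lambda text: datetime.datetime.strptime(text, "%Y-%m-%d").date(),
        default=None,
        help="date of the report to import, e.g. 2015-08-21",
    )
    return parser.parse_args(argv)
```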
The structure of the data being saved is:
not_in_production
stale_machines
unmanaged_machines
number_of_machines
hepspec_ratings
logical_cores
physical_cores
user
owner_group
owner_department
The table is named hardware_resources and lives in the database configured in Django's settings.py.
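To make the table layout concrete, the sketch below creates a table with these fields using Python's built-in `sqlite3` module rather than the Django ORM that the project uses. The column types and helper names are assumptions; the real schema is defined by the Django model and may contain further columns.

```python
import sqlite3

TABLE = "hardware_resources"

# Field names from the list above; the SQL types are assumed.
COLUMNS = {
    "owner_department": "TEXT",
    "owner_group": "TEXT",
    "user": "TEXT",
    "physical_cores": "INTEGER",
    "logical_cores": "INTEGER",
    "hepspec_ratings": "REAL",
    "number_of_machines": "INTEGER",
    "unmanaged_machines": "INTEGER",
    "stale_machines": "INTEGER",
    "not_in_production": "INTEGER",
}


def create_table(conn):
    """Create the table if it does not exist yet."""
    column_sql = ", ".join('"%s" %s' % (name, kind) for name, kind in COLUMNS.items())
    conn.execute('CREATE TABLE IF NOT EXISTS "%s" (%s)' % (TABLE, column_sql))


def save_row(conn, row):
    """Insert one aggregated row; 'row' maps every column name to a value."""
    names = list(COLUMNS)
    sql = 'INSERT INTO "%s" (%s) VALUES (%s)' % (
        TABLE,
        ", ".join('"%s"' % name for name in names),
        ", ".join(":" + name for name in names),  # named placeholders
    )
    conn.execute(sql, row)
```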
3. Hosting application data
Let’s assume the application has been deployed on a server named “example.cern.ch”.
The application accepts two additional GET flags, hw_details and hw_group_details.
The ?hw_details=[DEPARTMENT] flag shows all group statistics in the specified
DEPARTMENT (e.g. IT, PH, ...).
The ?hw_group_details=[GROUP_NAME] flag shows current_user statistics in the specified
DEPARTMENT and for the specified GROUP_NAME (e.g. PES, DB, ...).
Example:
https://example.cern.ch/accounting?hw_details=A&hw_group_details=A1
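For illustration, the two flags can be extracted from such a URL with the standard library, as sketched below. Inside the Django application the same values would normally be read from the request's GET parameters; the function name here is hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def parse_flags(url):
    """Return (department, group) from the hw_details / hw_group_details flags."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each name to a list of values; take the first or None.
    department = query.get("hw_details", [None])[0]
    group = query.get("hw_group_details", [None])[0]
    return department, group
```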
The picture below shows example statistics for 1 August 2015.
The fields in the table are as follows:

Department name

Department group

Current user (department-group)

Number of machines

Number of physical cores

Number of logical cores

Total HEP spec06 ratings (divided by 1000)

Number of unmanaged machines and (in brackets) percentage of unmanaged machines
(compared to number of machines)

Number of stale machines and (in brackets) percentage of stale machines (compared to
number of machines)

Number of machines that are not in production and (in brackets) percentage of those
machines (compared to total HEP spec06)
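The "count and (in brackets) percentage" cells for unmanaged and stale machines can be computed as in the sketch below. The function name, the one-decimal formatting and the handling of an empty group are assumptions, not taken from the application.

```python
def count_with_share(count, total):
    """Format a count with its share of the total in brackets, e.g. '5 (25.0%)'."""
    if total == 0:
        # Avoid division by zero for groups without machines.
        return "%d (0.0%%)" % count
    return "%d (%.1f%%)" % (count, 100.0 * count / total)
```

For example, a group of 20 machines with 5 unmanaged ones would be rendered by `count_with_share(5, 20)`.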
4. Usage
To generate useful data in the accounting web application, one needs to:
1. Run the generator, in a cron job or manually

generate_report.py path/to/file.conf
2. Move the data from the raw JSON into the database

accumulate.py path/to/file.conf [--date %Y-%m-%d]
3. Host the accounting application and make an HTTP query

http://example.cern.ch/accounting?hw_details=A&hw_group_details=A2
5. Conclusion
By using microservices as the architectural pattern for building a larger application, this
project has produced a system that will be easier to change in the future.
The project created a fully automated way to monitor the usage of hardware resources.
Monitoring of hardware resources can now be done on a daily basis, which helps to minimize
unwanted events.