= Retrace SOP
The retrace server provides complete tracebacks for unhandled crashes and
shows aggregated information for developers.
== Contact Information
Owner:::
Fedora QA Devel, Fedora Infrastructure Team, ABRT team
Contact:::
#abrt, #fedora-admin, #fedora-noc
Servers:::
retrace*, faf*
Purpose:::
Provides complete tracebacks for unhandled crashes and shows aggregated
information for developers.
== Description
The physical server (_retrace.fedoraproject.org_) runs two main services:
_retrace-server_ and _FAF_.
=== Retrace-server
The upstream for retrace server lives at:
____
https://github.com/abrt/retrace-server
____
When a user has the ABRT client installed and a process crashes with an
unhandled exception (e.g., traceback or core dump), the user can send a
request to _retrace-server_. The server installs the same set of
packages plus their debuginfo and returns a traceback to the user that
includes function names instead of plain pointers. This information is
useful for debugging.
The upstream _retrace-server_ allows users to upload coredumps through a
web interface, but the Fedora instance disables this feature.
=== FAF
When a user decides to report a crash, data is sent to _FAF_. ABRT can
also be configured to send microreports automatically, if desired.
_FAF_ can aggregate similar reports into one entity (called a Problem).
_FAF_ provides a web interface for developers, where they can see
crashes in their packages. It lives at:
____
https://retrace.fedoraproject.org/faf/
____
== Playbook
The playbook is split into several roles. There are two main roles:
* abrt/faf
* abrt/retrace
These roles are copied from upstream. You should never edit them
directly. A new version can be fetched from upstream using:
....
cd ansible/abrt
rm -rf faf retrace
ansible-galaxy install -f -r requirements.yml --ignore-errors -p ./
....
Review the resulting changes, then commit and push them.
There are also some roles that are local to our instance:
____
* abrt/faf-local - This is run _before_ abrt/faf.
* abrt/retrace-local - This is run _after_ abrt/retrace.
* abrt/retrace-local-pre - This is run _before_ abrt/retrace.
____
== Services
_FAF_ and _retrace-server_ are web applications; only httpd is required.
== Cron
_FAF_ and _retrace-server_ each have cron tasks. They are _not_ installed
under `/etc/cron*` but are installed as user cron jobs for the 'faf' and
'retrace' users.
You can list these cron jobs using:
....
sudo -u faf crontab -l
sudo -u retrace crontab -l
....
All cron jobs should be managed by Ansible. If you delete a cron job
from Ansible, make sure it does not remain on the server (removing it
with `state=absent` is not always possible).
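
One way to spot leftover entries, as a rough sketch: Ansible's cron module writes a `#Ansible: <name>` marker line directly above each entry it manages, so any entry without that marker is a candidate for manual cleanup. The helper name `unmanaged_cron_jobs` is hypothetical and is not part of the playbook.

```shell
# Hypothetical helper: reads a crontab on stdin and prints every entry
# that is not directly preceded by an "#Ansible: <name>" marker line.
unmanaged_cron_jobs() {
    awk '
        /^#Ansible:/         { managed = 1; next }  # marker for the next entry
        /^[[:space:]]*(#|$)/ { next }               # skip comments and blank lines
        { if (!managed) print; managed = 0 }        # print entries without a marker
    '
}
```

For example, `sudo -u faf crontab -l | unmanaged_cron_jobs` would list the entries in the faf crontab that Ansible does not appear to manage. Review the output before removing anything with `crontab -e`.
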
== Directories
* `/srv/ssd` - fast disk, used for PostgreSQL storage
* `/srv` - large disk, used for storing packages. Mainly:
** `/srv/faf/lob`
** `/srv/retrace`
* `/srv/faf/db-backup/` - Daily backups of the DB. There is no rotation yet,
so old backups need to be deleted manually from time to time.
* `/srv/faf/lob/InvalidUReport/` - Invalid reports; this directory can grow
quite large. There is no automatic removal either, so it needs to be purged manually from time to time.
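
Before a manual purge, it can help to see which files are old enough to go. The following is a minimal sketch that only lists files and deletes nothing. The helper name `list_old_files` and the 30-day threshold used below are assumptions, not an established retention policy.

```shell
# Hypothetical helper: list regular files under a directory that were
# last modified more than the given number of days ago.
#   $1 - directory to scan
#   $2 - age threshold in days
list_old_files() {
    find "$1" -type f -mtime +"$2" -print
}
```

For example, `list_old_files /srv/faf/db-backup 30` prints backups older than roughly a month, and `du -sh /srv/faf/db-backup /srv/faf/lob/InvalidUReport` shows how much space the two directories currently use. Check the list before deleting any of the files.
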
== Front-page
The main web page is provided by the _abrt-server-info-page_
package, which is configured in:
....
/usr/lib/python2.7/site-packages/abrt-server-info-page/config.py
....
== DB
Only _FAF_ uses a database. We use our own instance of PostgreSQL. You can
connect to it using:
....
sudo -u faf psql faf
....