--- title: "Customizing PTXQC reports" author: "Chris Bielow " date: '`r Sys.Date()`' output: html_document: default pdf_document: null vignette: > %\VignetteIndexEntry{Customizing PTXQC reports} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- # Customizing PTXQC reports Reports rely on a small set of parameters, which do two things: - set optimal values for the PTXQC metrics and scoring functions (usually tighly connected to the data and MS instrument) - report length: number of metrics shown in the report or partial evaluation of the txt-folder The first point can partially be addressed via a **mqpar.xml file**. However, not all thresholds are known or even interesting to MaxQuant, e.g. the number of expected proteins. For this, PTXQC comes with a **YAML config file**. This file also addresses the report length. ## Automatically importing MaxQuant parameters via mqpar.xml Some settings in MaxQuant, e.g. the MS/MS search tolerance are important for PTXQC in order to correctly score the mass recalibration or show expected error margins. So how does PTXQC know about the settings used during the MaxQuant run? Copy the `mqpar.xml` which MaxQuant generated in the analysis directory into your txt folder, e.g. `c:\MQ_Diabetes_study\mqpar.xml` -> `c:\MQ_Diabetes_study\combined\txt\mqpar.xml`. This will enable PTXQC to extract some parameter settings and results in more accurate reports. You will see a warning message during report generation if `mqpar.xml` is missing. The FAQ vignette has additional information on why this manual copying is senseful. Alternatively, you can set the MaxQuant parameters manually in the YAML file (see next section) to match your MaxQuant configuration. ## More customization using the YAML config file A configuration file allows you to enable/disable certain plots, and to modify most target thresholds used for the scoring metrics and within plots, e.g. the number of proteins you expect. This will differ by platform and instrument; so adapt it to your needs. The defaults are meant for a Thermo Orbitrap Velos, a 4h LC gradient for a complex matrix (e.g. human cells). Disabling certain report sections will decrease report size and hide metrics which you may not need. The time required to create the report will also reduce, but usually this is not a bottleneck. PTXQC uses YAML format for its parameter configuration. YAML is short for 'YAML Aint Markup Language'; yes, it's an endless recursion. There is a nice tutorial for YAML at [https://rhnh.net/2011/01/31/yaml-tutorial] -- or just Google to find more. Also, any PTXQC YAML file will start with a comment section which explains the format. PTXQC will create a `report_*.yaml` file within your txt folder upon its first invocation. If you did not run PTXQC before on this folder, you can take any other YAML config file or run PTXQC with default settings (see above) to create such a file. Edit the YAML file as you desire using a text editor of your choice (e.g. Notepad++). **Each YAML file has a leading comment section which will describe the basics of how YAML looks like and what you should/not change.** > Note: PTXQC will only read the YAML file which **matches its own version exactly**, i.e. `report_v0.82.1.yaml` will be > ignored if PTXQC has version `0.83.1` to avoid compatibility problems. A new YAML file matching the current PTXQC > version will automatically be generated. If you want to use your old settings, rename the file manually. The YAML shown below is shortened and lacks parts of the introductory comment section. But in general, the YAML file will look like this: # This is a configuration file for PTXQC reporting. # One such file is generated automatically every time a report PDF is created. # You can make a copy of this file, then modify its values and use the copy as an input to another round of report generation, # e.g., to exclude/include certain plots or change some global settings. # # << more comment text in original YAML file ... >> # PTXQC: UseLocalMQPar: yes ReportFilename: extended: yes NameLengthMax_num: 10.0 AddPageNumbers: 'on' order: qcMetric_PAR: 1.0 qcMetric_PG_PCA: 3.0 qcMetric_EVD_Top5Cont: 10.0 qcMetric_PG_Ratio: 19.0 qcMetric_EVD_UserContaminant: 20.0 << more metrics here... >> File: Parameters: enabled: yes Summary: enabled: yes IDRate: Thresh_bad_num: 20.0 Thresh_great_num: 35.0 ProteinGroups: ... Evidence: enabled: yes ProteinCountThresh_num: 3500.0 PeptideCountThresh_num: 15000.0 MatchBetweenRuns_wA: auto To disable certain metrics/plots, just change the **order** of the metric in question to a **negative** value, e.g. `qcMetric_PG_PCA: -3.0`, will disable the PCA plot. This is helpful to disable uninteresting metrics, but also a temporary solution if your PC cannot cope with RAM demands of PTXQC or a metric is broken (you should open a bug report in this case). ## Rename Raw files in the report PTXQC will try to shorten Raw file names to make the plot axis as compact as possible, giving more area to show the data rather than overly long Raw file names. It will use prefix and infix removal of strings which are common to all Raw file names. Name your files in a systematic to make this approach as effective as possible. Should this shortening not reach a predefined threshold of 10 characters in length for the longest filename after shortening, PTXQC will simply use consecutive numbers, i.e. **file 01**, **file 02** to name your files on the axis of the plots. If you do not want that, you can either: - Increase this length threshold in the YAML config to force usage of the shortened versions. But be advised that using long filenames might result in ugly axis with overlapping text etc. - Use a manual mapping table: a file named `report_vXXX_filename_sort.txt` will be created in your txt-folder, which contains the file name mapping that PTXQC computed. You can edit this file with any text editor and upon running PTXQC again, this new mapping will be applied. ## Reordering Raw files in the report By default, the order of Raw files in the report is _identical_ to the order specified within MaxQuant. You already read about the `report_vXXX_filename_sort.txt` file above. It can not only be used to assign custom names, but also to re-arrange the order in which Raw files are shown in the plots and heatmap. Open it with a text editor and simply rearrange the rows in the file. Easy!