ard DDIs are generated, only the highest score is retained to represent the maximum similarity against the well-known DDIs. The resulting matrix is not symmetric, so a symmetric transformation is applied that retains the maximum value of each pair of symmetric cells. In this way, each cell in the final M3 matrix represents a candidate DDI drug pair with its maximum similarity score with respect to a DDI drug pair deemed a true positive in our reference standard. DDIs from the M3 matrix are listed with their corresponding similarity scores. Cells on the matrix diagonal, which represent drugs interacting with themselves, are eliminated. Although our models are based on the maximum similarity score, the method allows the implementation of alternative algorithms.

Pharmacovigilance data: TWOSIDES database

We downloaded the TWOSIDES database, a data source of DDIs extracted by mining FAERS. We collected 13,105 DDIs related to the terms arrhythmia and bradyarrhythmia with a proportional reporting ratio (PRR) > 1 and a p-value < 0.05. These data were mapped to our initial DDI reference standard to find the DDIs in common. We retained the 386 DDIs present in both databases: 14 positives and 372 negatives. This final subset of DDIs was sorted by PRR and p-value and by the different similarity-based models.

Combination of similarity-based modeling

We constructed different complex models combining the M3 similarity-based scorings for the 386 cases analyzed in TWOSIDES. We used Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Through PCA, the six M3 scorings were transformed into a single component explaining 66.4% of the variance. The percentage of the variance explained by each additional factor is provided in S1 Fig. On the other hand, we trained an LDA model with 14 positive and 372 negative cases.
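The M3 post-processing described earlier (symmetrizing by the maximum of each mirrored cell pair, dropping the diagonal, and listing the pairs with their scores) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, drug labels and score values are all hypothetical, and the matrix is assumed to be a square list of lists indexed by drug.

```python
from itertools import combinations


def symmetrize_max(scores):
    """Return a symmetric copy where cell (i, j) holds max(scores[i][j], scores[j][i])."""
    n = len(scores)
    return [[max(scores[i][j], scores[j][i]) for j in range(n)] for i in range(n)]


def ranked_candidates(scores, drugs):
    """List each unordered drug pair once (diagonal excluded), highest score first."""
    sym = symmetrize_max(scores)
    # combinations() only yields i < j, so self-pairs on the diagonal never appear.
    pairs = [(drugs[i], drugs[j], sym[i][j])
             for i, j in combinations(range(len(drugs)), 2)]
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Hypothetical 3-drug example; the values are illustrative, not taken from the study.
drugs = ["drugA", "drugB", "drugC"]
m3_raw = [
    [1.0, 0.2, 0.7],
    [0.5, 1.0, 0.1],
    [0.3, 0.4, 1.0],
]
ranking = ranked_candidates(m3_raw, drugs)
for first, second, score in ranking:
    print(first, second, score)
```

Taking the maximum rather than, say, the mean of the two mirrored cells matches the stated choice of keeping the highest similarity to a known DDI; a different aggregation could be swapped into `symmetrize_max` if an alternative algorithm were preferred.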
Five variables were introduced in the model using the forward-stepwise method: the 2D, 3D, TPF, DDIPF and ATC scores. The statistical quality of the model was assessed through parameters such as Wilks' statistic, the Fisher ratio (F = 14.8) and the significance level. S1 Fig also provides the AUROC results of LDA models including from 1 to 5 variables.

Assessment of the performances

We measured the enrichment factor detected in TWOSIDES as the ratio between the prevalence detected in TWOSIDES and the prevalence in the initial reference standard. Prevalence is defined as the proportion of known, well-established DDIs among all DDI candidate cases. We also ranked the DDIs according to the proportional reporting ratio and p-value provided by TWOSIDES, and according to our different similarity-based DDI models, and assessed the precision at different top-ranked positions. Precision, or positive predictive value, was calculated as the ratio of true positives to all cases predicted as positive (true positives plus false positives). To compare performances we also used areas under the receiver operating characteristic (ROC) curves. If the area under the curve is 0.5 the classifier is random, whereas a perfect classifier yields an area of 1. ROC curves were also plotted, showing the true positive fraction against the false positive fraction. We performed an external evaluation using reference standard data sources, such as Drugdex and Drugs.com, to classify the remaining 372 candidates as positive or negative cases.

Results

Performance in TWOSIDES using the initial reference standard

We mapped our initi