Ahmed T. Hammad

pbox: Exploring Multivariate Spaces with Probability Boxes

… Last time

In a previous post I introduced the idea of a “probability box.” Well, after several intense months of hard work, I am thrilled to announce that my idea has been transformed into a fully functional R library, now available on CRAN for everyone interested in answering probabilistic questions!

pbox

🌟 Introducing pbox! 🌟 pbox is a statistical library that encapsulates the probability space of a dataset in Probability Boxes (p-boxes) and lets you query it effortlessly. Its distinctive feature is the ease with which users can explore marginal, joint, and conditional probabilities while accounting for the correlation structure in the data through copula models. pbox can be used across various fields, such as environmental analysis, financial risk assessment and management, and more!

This is just the beginning. In future releases, I plan to add more functionality to enhance pbox even further. Your feedback and suggestions are invaluable to me. If you have any ideas or requests, please feel free to drop me a message or open an issue on the project’s repository.

Here is a little demo to showcase what can be achieved with a few lines of code!

Remember to install the package from CRAN first.

install.packages("pbox")
library(pbox)

data("SEAex", package = "pbox")

Create a PBOX Object

We create a pbox object from the SEAex dataset using the set_pbox function.

# Set pbox
pbx <- set_pbox(SEAex)
It seems your data might not be stationary!
pbox object generated!
print(pbx)
Probabilistic Box Object of class pbox

||--General Overview--||
----------------
1)Data Structure
Number of Rows:  122 
Number of Columns:  4 

1.1)Variable Statistics:
         var   min   max     mean median
      <char> <num> <num>    <num>  <num>
1:  Malaysia 30.50 32.30 31.24344  31.20
2:  Thailand 33.20 37.30 35.10656  35.10
3:   Vietnam 30.90 32.90 31.63934  31.60
4: avgRegion 25.21 26.66 25.78951  25.73

----------------
2)Copula Summary:
Type: ellipCopula 
Normal copula, dim. d = 4 
Dimension:  4 
Parameters:
  rho.1   = 0.4922978
dispstr:  ex 

2.1)Copula margins:
[1] "RG"  "SN1" "RG"  "RG" 
2.2)Kendall correlation:
           Malaysia  Thailand   Vietnam avgRegion
Malaysia  1.0000000 0.1755378 0.3864290 0.5751234
Thailand  0.1755378 1.0000000 0.2246915 0.2472509
Vietnam   0.3864290 0.2246915 1.0000000 0.4424894
avgRegion 0.5751234 0.2472509 0.4424894 1.0000000

-------------------------------
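
To give some intuition for the copula summary above, here is a minimal sketch in base R that does not use pbox. It samples from a bivariate Gaussian copula, attaches two arbitrary margins, and compares the sample Kendall correlation with the known relation tau = (2/pi) * asin(rho) for elliptical copulas. The sample size and the normal and gamma margins are illustrative assumptions, not the margins pbox fitted; only the value of rho is taken from the printed summary.

```r
# Sample from a bivariate Gaussian copula using base R only
set.seed(1)
n   <- 2000
rho <- 0.4922978                       # copula parameter from the printed summary

z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)   # correlated standard normals
u  <- cbind(pnorm(z1), pnorm(z2))             # uniform margins, dependence kept

# Attach margins through their quantile functions
# (illustrative choices, not the distributions selected by pbox)
x1 <- qnorm(u[, 1], mean = 31, sd = 0.4)
x2 <- qgamma(u[, 2], shape = 50, rate = 1.5)

# Kendall's tau depends only on ranks, so the margins do not change it
tau_sample <- cor(x1, x2, method = "kendall")
tau_theory <- 2 / pi * asin(rho)
c(sample = tau_sample, theory = tau_theory)
```

The point of the sketch is the separation of concerns: the copula carries the dependence, while each variable keeps its own marginal distribution.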

Explore Probability Space

We can query the probabilistic space of the pbox object using the qpbox function. Below are examples of different types of queries.

# Marginal Distribution

qpbox(pbx, mj = "Malaysia:33")
        P 
0.9986981 
# Joint Distribution

qpbox(pbx, mj = "Malaysia:33 & Vietnam:34")
        P 
0.9981121 
# Conditional Distribution

qpbox(pbx, mj = "Vietnam:31", co = "avgRegion:26")
         P 
0.03647037 
# Conditional Distribution with Fixed Conditions

qpbox(pbx, mj = "Malaysia:33 & Vietnam:31", co = "avgRegion:26", fixed = TRUE)
       P 
0.976313 
# Joint Distribution with Mean Values

qpbox(pbx, mj = "mean:c(Vietnam,Thailand)", lower.tail = TRUE)
        P 
0.3803387 
# Joint Distribution with Median Values

qpbox(pbx, mj = "median:c(Vietnam, Thailand)", lower.tail = TRUE)
        P 
0.3597187 
# Joint Distribution with Specific Values

qpbox(pbx, mj = "Malaysia:33 & mean:c(Vietnam, Thailand)", lower.tail = TRUE)
        P 
0.3803302 
# Conditional Distribution with Mean Conditions

qpbox(pbx, mj = "Malaysia:33 & median:c(Vietnam,Thailand)", co = "mean:c(avgRegion)")
        P 
0.6329741 
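
To make the difference between marginal, joint, and conditional queries concrete, here is a minimal sketch in base R on simulated data. It estimates the probabilities by simple counting rather than through fitted margins and a copula, and it assumes the conditioning event is itself a "less than or equal" event. The variables `x` and `y` and the thresholds are hypothetical.

```r
# Two positively correlated variables (hypothetical data, not SEAex)
set.seed(42)
n <- 10000
x <- rnorm(n)
y <- 0.6 * x + sqrt(1 - 0.6^2) * rnorm(n)

# Marginal: P(X <= 0.5)
p_marginal <- mean(x <= 0.5)

# Joint: P(X <= 0.5 and Y <= 0)
p_joint <- mean(x <= 0.5 & y <= 0)

# Conditional: P(X <= 0.5 | Y <= 0) = joint / P(Y <= 0)
p_conditional <- p_joint / mean(y <= 0)

round(c(marginal = p_marginal, joint = p_joint, conditional = p_conditional), 3)
```

Because the two variables are positively correlated, knowing that `y` is low makes a low `x` more likely, so the conditional probability comes out above the marginal one.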

Confidence Intervals

qpbox(pbx, mj = "Malaysia:33 & median:c(Vietnam,Thailand)", co = "mean:c(avgRegion)", CI = TRUE, fixed = TRUE)
        P      2.5%     97.5% 
0.6557157 0.5662971 0.7545044 
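
The output above reports 2.5% and 97.5% bounds around P. One common way to obtain bounds of this kind is a percentile bootstrap, sketched below in base R on simulated data. Whether pbox uses exactly this procedure internally is an assumption here; the sample, the threshold, and the number of replicates are all hypothetical.

```r
set.seed(123)
x <- rnorm(200, mean = 31.2, sd = 0.4)   # hypothetical temperature-like sample

# Statistic of interest: empirical P(X <= 31.5)
p_hat <- mean(x <= 31.5)

# Resample with replacement and recompute the statistic each time
boot_p <- replicate(2000, mean(sample(x, replace = TRUE) <= 31.5))

# Percentile interval from the bootstrap distribution
ci <- quantile(boot_p, probs = c(0.025, 0.975))
c(P = p_hat, ci)
```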

Grid Search

We can perform a grid search to explore the probabilistic space over a grid of values.

grid_results <- grid_pbox(pbx, mj = c("Vietnam", "Malaysia"))
print(grid_results)
     Vietnam Malaysia        probs
       <num>    <num>       <list>
  1:    30.9     30.5 0.0001462783
  2:    31.2     30.5 0.0004897392
  3:    31.3     30.5  0.000556562
  4:    31.4     30.5 0.0005973167
  5:    31.5     30.5 0.0006203644
 ---                              
117:    31.7     32.3    0.6206133
118:    31.8     32.3    0.6980325
119:    32.0     32.3     0.813852
120:    32.3     32.3    0.9109836
121:    32.9     32.3    0.9727657
print(grid_results[which.max(grid_results$probs),])
   Vietnam Malaysia     probs
     <num>    <num>    <list>
1:    32.9     32.3 0.9727657
print(grid_results[which.min(grid_results$probs),])
   Vietnam Malaysia        probs
     <num>    <num>       <list>
1:    30.9     30.5 0.0001462783
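
The idea behind a grid search can be sketched in base R without pbox: build every combination of candidate values, evaluate a joint probability at each one, and then pick the extremes. The example below uses simulated data and an empirical joint probability; the variables `a` and `b` and the choice of quantiles as grid points are hypothetical.

```r
set.seed(7)
n <- 500
a <- rnorm(n, 31.6, 0.4)                       # hypothetical variable A
b <- 0.5 * (a - 31.6) + rnorm(n, 31.2, 0.35)   # hypothetical variable B, correlated with A

# Candidate thresholds: a few quantiles of each variable
grid <- expand.grid(
  A = quantile(a, c(0.1, 0.5, 0.9)),
  B = quantile(b, c(0.1, 0.5, 0.9))
)

# Joint empirical probability P(A <= a_i, B <= b_i) for each grid row
grid$probs <- mapply(function(ai, bi) mean(a <= ai & b <= bi), grid$A, grid$B)

grid[which.max(grid$probs), ]   # most probable combination
grid[which.min(grid$probs), ]   # least probable combination
```

Since a joint cumulative probability can only grow as the thresholds grow, the maximum sits at the largest pair of values and the minimum at the smallest pair, which matches the pattern in the output above.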

Scenario Analysis

We perform scenario analysis by modifying underlying parameters of the pbox object.

scenario_results <- scenario_pbox(pbx, mj = "Vietnam:31 & avgRegion:26", param_list = list(Vietnam = "mu"))
print(scenario_results)
$`SD-3`
         P 
0.09640711 

$`SD-2`
         P 
0.06788253 

$`SD-1`
         P 
0.04519266 

$SD0
         P 
0.02820379 

$SD1
         P 
0.01633734 

$SD2
          P 
0.008684461 

$SD3
          P 
0.004181092
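
As a rough illustration of what such a scenario analysis does, the sketch below perturbs the mean of a single normal distribution in steps and recomputes a probability under each scenario. It is univariate for simplicity and does not use pbox; the values of `mu`, `sigma`, and `step`, and the way each scenario shifts the parameter, are hypothetical assumptions rather than pbox's actual procedure.

```r
mu     <- 31.6   # hypothetical fitted mean
sigma  <- 0.4    # hypothetical fitted standard deviation
step   <- 0.1    # hypothetical size of one perturbation step
shifts <- -3:3

# Shift the mean by k steps and recompute P(X <= 31) under each scenario
scenarios <- sapply(shifts, function(k) pnorm(31, mean = mu + k * step, sd = sigma))
names(scenarios) <- paste0("SD", shifts)
scenarios
```

As in the output above, raising the mean makes a low value less likely, so the probability falls steadily from the SD-3 scenario to the SD3 scenario.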