Home

Understanding personal protective behaviours and opportunities for interventions:

Results from a multi-method investigation of cross-sectional data

Kaisa Saurio, James Twose, Gjalt-Jorn Peters, Matti Heino & Nelli Hankonen

Approaches used here: Potential for Change Indices (PCIs), Linear Regression, Correlations

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import session_info

from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold, GroupKFold, GroupShuffleSplit, RepeatedStratifiedKFold, RepeatedKFold
In [2]:
from sklearn.model_selection import train_test_split, GridSearchCV
from scipy import stats
In [3]:
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score, cross_validate
from sklearn.linear_model import LinearRegression
In [4]:
from sklearn.model_selection import KFold
In [5]:
import statsmodels.api as sm
In [6]:
from jmspack.frequentist_statistics import (correlation_analysis,
                                            potential_for_change_index,
                                            multiple_univariate_OLSs
                                           )
from jmspack.utils import (flatten,
                           apply_scaling,
                           JmsColors
                          )
In [7]:
if "jms_style_sheet" in plt.style.available:
    plt.style.use("jms_style_sheet")

# _ = sns.set_style("whitegrid")

Virtual Environments and Packages

In [8]:
session_info.show(req_file_name="corona_preppers-requirements.txt",
      write_req_file=False)  # set write_req_file=True to also write the packages used to the requirements file
Out[8]:
Session information:
-----
jmspack             0.1.1
matplotlib          3.5.1
numpy               1.21.5
pandas              1.4.2
scipy               1.7.3
seaborn             0.11.2
session_info        1.0.0
sklearn             1.0.2
statsmodels         0.13.2
-----
Modules imported as dependencies:
PIL                         9.0.1
appnope                     0.1.2
asttokens                   NA
backcall                    0.2.0
beta_ufunc                  NA
binom_ufunc                 NA
bottleneck                  1.3.4
cffi                        1.15.0
colorama                    0.4.4
cycler                      0.10.0
cython_runtime              NA
dateutil                    2.8.2
debugpy                     1.5.1
decorator                   5.1.1
defusedxml                  0.7.1
entrypoints                 0.4
executing                   0.8.3
ipykernel                   6.9.1
ipython_genutils            0.2.0
jedi                        0.18.1
joblib                      1.1.0
jupyter_server              1.13.5
kiwisolver                  1.3.1
matplotlib_inline           NA
mkl                         2.4.0
mpl_toolkits                NA
nbinom_ufunc                NA
numexpr                     2.8.1
packaging                   21.3
parso                       0.8.3
patsy                       0.5.2
pexpect                     4.8.0
pickleshare                 0.7.5
pkg_resources               NA
prompt_toolkit              3.0.20
ptyprocess                  0.7.0
pure_eval                   0.2.2
pydev_ipython               NA
pydevconsole                NA
pydevd                      2.6.0
pydevd_concurrency_analyser NA
pydevd_file_utils           NA
pydevd_plugins              NA
pydevd_tracing              NA
pygments                    2.11.2
pyparsing                   3.0.4
pytz                        2021.3
setuptools                  61.2.0
six                         1.16.0
stack_data                  0.2.0
threadpoolctl               2.2.0
tornado                     6.1
traitlets                   5.1.1
typing_extensions           NA
wcwidth                     0.2.5
zmq                         22.3.0
-----
IPython             8.2.0
jupyter_client      7.2.2
jupyter_core        4.9.2
jupyterlab          3.3.2
notebook            6.4.8
-----
Python 3.10.4 (main, Mar 31 2022, 03:38:35) [Clang 12.0.0 ]
macOS-10.16-x86_64-i386-64bit
-----
Session information updated at 2022-06-11 12:00

Read in data, show info and data head

In [9]:
df = pd.read_csv("data/shield_gjames_21-09-20_prepped.csv").drop("Unnamed: 0", axis=1)
In [10]:
df.head()
Out[10]:
id sampling_weight demographic_gender demographic_age demographic_4_areas demographic_8_areas demographic_higher_education behaviour_indoors_nonhouseholders behaviour_close_contact behaviour_quarantined ... intention_public_transport_recoded intention_indoor_meeting_recoded intention_restaurant_recoded intention_pa_recoded intention_composite behaviour_indoors_nonhouseholders_recoded behaviour_unmasked_recoded behavior_composite behavior_composite_recoded intention_behavior_composite
0 1 2.060959 2 60+ 2 7 0 2 5 2 ... 0 0 0 0 0 1.000000 0.000000 0.000000 0.000000 0.000000
1 2 1.784139 2 40-49 1 1 1 3 3 2 ... 0 1 1 1 3 0.785714 0.214286 0.168367 0.841837 1.920918
2 3 1.204000 1 60+ 1 2 1 4 4 2 ... 0 0 0 0 0 0.500000 0.214286 0.107143 0.535714 0.267857
3 4 2.232220 1 60+ 2 6 0 4 3 2 ... 0 2 0 2 4 0.500000 0.500000 0.250000 1.250000 2.625000
4 5 1.627940 2 18-29 1 3 0 6 3 2 ... 0 2 0 0 2 0.000000 0.214286 0.000000 0.000000 1.000000

5 rows × 106 columns

In [11]:
sdt_columns = df.filter(regex="sdt").columns.tolist()
In [12]:
drop_sdt = True
if drop_sdt:
    df=df.drop(sdt_columns, axis=1)
In [13]:
df.shape
Out[13]:
(2272, 87)

Specify the feature list and the grouping variable, and set the grouping variable as categorical

In [14]:
target = "intention_behavior_composite"
In [15]:
# Reverse the composite score: each value x becomes 10 - x
df[target] = (df[target] - 10) * -1
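The reversal in the cell above can be checked on a few toy values. This is a minimal sketch, assuming a composite bounded between 0 and 10 (the values below are illustrative, not survey data). It also shows why the summary table further down reports a minimum of `-0.0`: multiplying a floating-point zero by -1 yields negative zero, whereas the algebraically equivalent `10 - x` does not.

```python
import numpy as np

# Toy values on a 0-10 scale (illustrative only, not taken from the survey).
scores = np.array([0.0, 2.5, 10.0])

# Form used in the notebook: subtract the scale maximum, then flip the sign.
reversed_notebook = (scores - 10) * -1

# Equivalent form that does not produce a negative zero.
reversed_plain = 10 - scores

print(reversed_notebook)
print(reversed_plain)
```

Both forms give the same numbers; only the sign bit of the zero differs.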
In [16]:
features_list = df.filter(regex="^automaticity|attitude|^norms|^risk|^effective").columns.tolist()
In [17]:
meta_columns = ['Original position', 'Variable name', 'Label',
       'Item english translation ', 'Label short', 'Type', 'New variable name',
       'variable name helper',
       'Of primary interest as a predictor (i.e. feature)?', 'English lo-anchor',
       'English hi-anchor']
In [18]:
sheet_id = "1BEX4W8XRGnuDk4Asa_pdKij3EIZBvhSPqHxFrDjM07k"
sheet_name = "Variable_names"
url = f"https://docs.google.com/spreadsheets/d/{sheet_id}/gviz/tq?tqx=out:csv&sheet={sheet_name}"
meta_df = pd.read_csv(url).loc[:, meta_columns]
In [19]:
meta_list = df.filter(regex="^automaticity|attitude|^norms|^risk|^effective|^behaviour|^intention").columns.tolist()
In [20]:
pd.set_option("display.max_colwidth", 350)
pd.set_option('display.expand_frame_repr', True)
meta_df.loc[meta_df["New variable name"].isin(meta_list), ["Item english translation ", "New variable name"]]
Out[20]:
Item english translation New variable name
12 How often in the last 7 days have you been indoors with people outside your household so that it is not related to obligations? For example, meeting friends, visiting hobbies, non-essential shopping, or other activities that are not required for your work or other duties. behaviour_indoors_nonhouseholders
13 In the last 7 days, have you been in close contact with people outside your household? Direct contact means spending more than one minute less than two meters away from another person or touching (e.g., shaking hands) outdoors or indoors. behaviour_close_contact
14 Are you currently in quarantine or isolation due to an official instruction or order? (For example, because you are waiting for a corona test, have returned from abroad or been exposed to a coronavirus) behaviour_quarantined
15 How often in the last 7 days were you in your free time without a mask indoors with people you don’t live with? behaviour_unmasked
24 If in the next 7 days you go to visit the following indoor spaces and there are people outside your household, Are you going to wear a mask? Grocery store or other store intention_store
25 If in the next 7 days you go to visit the following indoor spaces and there are people outside your household, Are you going to wear a mask? Bus, train or other means of public transport intention_public_transport
26 If in the next 7 days you go to visit the following indoor spaces and there are people outside your household, Are you going to wear a mask? Meeting people outside your household indoors intention_indoor_meeting
27 If in the next 7 days you go to visit the following indoor spaces and there are people outside your household, Are you going to wear a mask? Cafe, restaurant or bar indoors intention_restaurant
28 If in the next 7 days you go to visit the following indoor spaces and there are people outside your household, Are you going to wear a mask? Indoor exercise intention_pa
29 Taking a mask with you to a store or public transport, for example, has already become automatic for some and is done without thinking. For others, taking a mask with them is not automatic at all, but requires conscious thinking and effort. automaticity_carry_mask
30 Putting on a mask, for example in a shop or on public transport, has already become automatic for some and it happens without thinking. For others, putting on a mask is not automatic at all, but requires conscious thinking and effort. automaticity_put_on_mask
32 What consequences do you think it has if you use a face mask in your free time? If or when I use a face mask… inst_attitude_protects_self
33 What consequences do you think it has if you use a face mask in your free time? If or when I use a face mask… inst_attitude_protects_others
34 What consequences do you think it has if you use a face mask in your free time? If or when I use a face mask… inst_attitude_sense_of_community
35 What consequences do you think it has if you use a face mask in your free time? If or when I use a face mask… inst_attitude_enough_oxygen
36 What consequences do you think it has if you use a face mask in your free time? If or when I use a face mask… inst_attitude_no_needless_waste
37 Who thinks you should use a face mask and who thinks not? In the following questions, by using a face mask, we mean holding a cloth or disposable face mask, surgical mask, or respirator on the face so that it covers the nose and mouth. The questions concern leisure time. My family and friends think I should .. norms_family_friends
38 People at risk think I should .. norms_risk_groups
39 The authorities think I should .. norms_officials
40 In the indoors spaces I visit, people on the site think I should… norms_people_present_indoors
41 When I use a face mask, I feel or would feel ... aff_attitude_comfortable
42 When I use a face mask, I feel or would feel ... aff_attitude_calm
43 When I use a face mask, I feel or would feel ... aff_attitude_safe
44 When I use a face mask, I feel or would feel ... aff_attitude_responsible
45 When I use a face mask, I feel or would feel ... aff_attitude_difficult_breathing
61 If two unvaccinated people from different households meet indoors, what means do you think would be effective in preventing coronavirus infection? Hand washing and use of gloves effective_means_handwashing
62 Using a face mask effective_means_masks
63 Keeping a safety distance (2 meters) effective_means_distance
64 Ventilation effective_means_ventilation
65 How likely do you think you will get a coronavirus infection in your free time in the next month? risk_likely_contagion
66 How likely do you think you would get a coronavirus infection in your free time in the next month if you did nothing to protect yourself from it? risk_contagion_absent_protection
67 If you got a coronavirus infection, how serious a threat would you rate it to your health? risk_severity
68 Spread of coronavirus… risk_fear_spread
69 The fact that I would get infected myself .. risk_fear_contagion_self
70 That my loved one would get infected... risk_fear_contagion_others
71 Consequences of measures taken to prevent the spread of the coronavirus... risk_fear_restrictions
In [21]:
pd.set_option("display.max_colwidth", 100)

EDA on the target

Check the distribution of the target and the number of samples at each target value

In [22]:
_ = sns.violinplot(data=df[[target]].melt(), 
                    x="variable", 
                    y="value"
               )
_ = sns.stripplot(data=df[[target]].melt(), 
                    x="variable", 
                    y="value",
                  edgecolor='white',
                  linewidth=0.5
               )
In [23]:
pd.crosstab(df["demographic_gender"], df["demographic_age"])
Out[23]:
demographic_age     18-29  30-39  40-49  50-59  60+
demographic_gender
1                     114    169    187    168  337
2                     281    185    229    211  391
In [24]:
target_df = df[target]
target_df.describe().to_frame().T
Out[24]:
                               count      mean       std   min       25%       50%  75%   max
intention_behavior_composite  2272.0  8.582428  1.524704  -0.0  8.017857  8.964286  9.5  10.0
In [25]:
_ = plt.figure(figsize=(20, 5))
_ = sns.countplot(x=target_df)
_ = plt.xticks(rotation=90)
In [26]:
df = (df[["demographic_age", "demographic_higher_education"] + features_list + [target]])
In [27]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2272 entries, 0 to 2271
Data columns (total 30 columns):
 #   Column                            Non-Null Count  Dtype  
---  ------                            --------------  -----  
 0   demographic_age                   2272 non-null   object 
 1   demographic_higher_education      2272 non-null   int64  
 2   automaticity_carry_mask           2272 non-null   int64  
 3   automaticity_put_on_mask          2272 non-null   int64  
 4   inst_attitude_protects_self       2272 non-null   int64  
 5   inst_attitude_protects_others     2272 non-null   int64  
 6   inst_attitude_sense_of_community  2272 non-null   int64  
 7   inst_attitude_enough_oxygen       2272 non-null   int64  
 8   inst_attitude_no_needless_waste   2272 non-null   int64  
 9   norms_family_friends              2272 non-null   int64  
 10  norms_risk_groups                 2272 non-null   int64  
 11  norms_officials                   2272 non-null   int64  
 12  norms_people_present_indoors      2272 non-null   int64  
 13  aff_attitude_comfortable          2272 non-null   int64  
 14  aff_attitude_calm                 2272 non-null   int64  
 15  aff_attitude_safe                 2272 non-null   int64  
 16  aff_attitude_responsible          2272 non-null   int64  
 17  aff_attitude_difficult_breathing  2272 non-null   int64  
 18  effective_means_handwashing       2272 non-null   int64  
 19  effective_means_masks             2272 non-null   int64  
 20  effective_means_distance          2272 non-null   int64  
 21  effective_means_ventilation       2272 non-null   int64  
 22  risk_likely_contagion             2272 non-null   int64  
 23  risk_contagion_absent_protection  2272 non-null   int64  
 24  risk_severity                     2272 non-null   int64  
 25  risk_fear_spread                  2272 non-null   int64  
 26  risk_fear_contagion_self          2272 non-null   int64  
 27  risk_fear_contagion_others        2272 non-null   int64  
 28  risk_fear_restrictions            2272 non-null   int64  
 29  intention_behavior_composite      2272 non-null   float64
dtypes: float64(1), int64(28), object(1)
memory usage: 532.6+ KB
In [28]:
display(df[target].value_counts().head().to_frame()), df.shape[0], df[target].value_counts().head().sum()
           intention_behavior_composite
10.000000                           424
9.500000                            228
9.000000                            187
8.885204                            155
9.385204                            112
Out[28]:
(None, 2272, 1106)

Correlations between features and target

In [29]:
dict_results = correlation_analysis(df, 
                                    row_list=[target],
                                    col_list=features_list,
                                    # method='pearson', 
                                    check_norm=True,
                                    dropna='pairwise')
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
In [30]:
corrs_df = dict_results["summary"].sort_values(by="r-value", ascending=False)
corrs_df.head(5)
Out[30]:
    analysis       feature1                      feature2                      r-value   p-value       stat-sign  N
10  Spearman Rank  norms_people_present_indoors  intention_behavior_composite  0.417490  1.573553e-96  True       2272
17  Spearman Rank  effective_means_masks         intention_behavior_composite  0.398325  2.909620e-87  True       2272
0   Spearman Rank  automaticity_carry_mask       intention_behavior_composite  0.373645  3.383126e-76  True       2272
22  Spearman Rank  risk_severity                 intention_behavior_composite  0.367338  1.609304e-73  True       2272
7   Spearman Rank  norms_family_friends          intention_behavior_composite  0.365861  6.684440e-73  True       2272
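The summary above reports Spearman rank correlations. As a reminder of what that statistic measures, here is a minimal sketch on made-up ordinal data: Spearman's rho is Pearson's r computed on the average ranks of the two variables. The column names and values are illustrative only, and this is not how `correlation_analysis` is implemented internally.

```python
import pandas as pd

# Toy ordinal data with ties (illustrative values only).
toy = pd.DataFrame({
    "feature": [1, 2, 2, 4, 5, 7, 7],
    "target": [2.0, 3.5, 3.0, 6.0, 5.5, 9.0, 10.0],
})

# Tied values receive the average of the ranks they span.
ranks = toy.rank(method="average")

# Pearson correlation of the ranks is Spearman's rho.
rho = ranks["feature"].corr(ranks["target"])

print(round(rho, 4))
```

Because only ranks enter the calculation, the statistic suits Likert-style items whose spacing between response options is not guaranteed to be equal.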
In [31]:
_ = sns.boxplot(data=corrs_df[["r-value", "p-value"]].melt(),
                x="variable", y="value")
_ = plt.axhline(y=0.05, c="grey", ls="--")
In [32]:
neg_corrs_features = corrs_df[corrs_df["r-value"] < 0].feature1.tolist()
In [33]:
neg_corrs_features
Out[33]:
['aff_attitude_difficult_breathing', 'risk_fear_restrictions']
In [34]:
for feature in neg_corrs_features:
    _ = sns.lmplot(data=df, 
               x=target, 
               y=feature, 
               hue="demographic_age",
              legend=True)

Multivariate Linear Regression

In [35]:
X = df[features_list]
y = df[target]
In [36]:
mod = sm.OLS(endog=y, exog=X)
res = mod.fit()
display(res.summary())
OLS Regression Results
Dep. Variable: intention_behavior_composite R-squared (uncentered): 0.980
Model: OLS Adj. R-squared (uncentered): 0.980
Method: Least Squares F-statistic: 4050.
Date: Sat, 11 Jun 2022 Prob (F-statistic): 0.00
Time: 12:00:17 Log-Likelihood: -3705.8
No. Observations: 2272 AIC: 7466.
Df Residuals: 2245 BIC: 7620.
Df Model: 27
Covariance Type: nonrobust
coef std err t P>|t| [0.025 0.975]
automaticity_carry_mask 0.0923 0.031 2.933 0.003 0.031 0.154
automaticity_put_on_mask 0.0787 0.034 2.331 0.020 0.012 0.145
inst_attitude_protects_self -0.0120 0.024 -0.491 0.624 -0.060 0.036
inst_attitude_protects_others 0.2237 0.032 6.901 0.000 0.160 0.287
inst_attitude_sense_of_community -0.0035 0.020 -0.173 0.863 -0.043 0.036
inst_attitude_enough_oxygen 0.1177 0.021 5.668 0.000 0.077 0.158
inst_attitude_no_needless_waste 0.0190 0.016 1.172 0.241 -0.013 0.051
norms_family_friends 0.0758 0.027 2.840 0.005 0.023 0.128
norms_risk_groups -4.061e-05 0.034 -0.001 0.999 -0.067 0.067
norms_officials 0.1520 0.032 4.817 0.000 0.090 0.214
norms_people_present_indoors 0.1691 0.024 7.160 0.000 0.123 0.215
aff_attitude_comfortable 0.0224 0.029 0.778 0.437 -0.034 0.079
aff_attitude_calm 0.0316 0.027 1.192 0.233 -0.020 0.084
aff_attitude_safe 0.0213 0.031 0.688 0.492 -0.039 0.082
aff_attitude_responsible 0.0120 0.033 0.363 0.717 -0.053 0.077
aff_attitude_difficult_breathing 0.1773 0.019 9.375 0.000 0.140 0.214
effective_means_handwashing 0.0465 0.021 2.170 0.030 0.004 0.089
effective_means_masks 0.0202 0.028 0.721 0.471 -0.035 0.075
effective_means_distance 0.1063 0.026 4.127 0.000 0.056 0.157
effective_means_ventilation 0.0497 0.020 2.507 0.012 0.011 0.089
risk_likely_contagion 0.0551 0.023 2.344 0.019 0.009 0.101
risk_contagion_absent_protection 0.0012 0.020 0.062 0.951 -0.038 0.040
risk_severity 0.1452 0.021 6.893 0.000 0.104 0.187
risk_fear_spread 0.0224 0.026 0.848 0.396 -0.029 0.074
risk_fear_contagion_self -0.0313 0.026 -1.206 0.228 -0.082 0.020
risk_fear_contagion_others -0.0046 0.025 -0.185 0.853 -0.053 0.044
risk_fear_restrictions 0.0210 0.014 1.451 0.147 -0.007 0.049
Omnibus: 288.080 Durbin-Watson: 1.928
Prob(Omnibus): 0.000 Jarque-Bera (JB): 1109.978
Skew: -0.581 Prob(JB): 9.36e-242
Kurtosis: 6.221 Cond. No. 45.1


Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors assume that the covariance matrix of the errors is correctly specified.
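Note [1] matters for reading the R² of 0.980: the design matrix passed to `sm.OLS` has no constant column, so the reported R² is uncentered, meaning it compares the residuals against the sum of squared target values rather than against the variance around the target mean. When the target mean is far from zero, this value can be high even for weak predictors. The sketch below illustrates the effect on synthetic data using plain NumPy; the data and the helper `fit_r2` are hypothetical. In statsmodels, an intercept can be added with `sm.add_constant(X)`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a target far from zero that is only weakly related to x.
n = 500
x = rng.integers(1, 8, size=n).astype(float)       # 1-7 Likert-style item
y = 8.5 + 0.1 * x + rng.normal(0.0, 1.5, size=n)   # mean well above zero


def fit_r2(design, y):
    """Least-squares fit; return (centered R2, uncentered R2)."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    ss_res = np.sum(resid ** 2)
    r2_centered = 1.0 - ss_res / np.sum((y - y.mean()) ** 2)
    r2_uncentered = 1.0 - ss_res / np.sum(y ** 2)
    return r2_centered, r2_uncentered


no_const = x.reshape(-1, 1)                    # predictor only, no intercept
with_const = np.column_stack([np.ones(n), x])  # intercept column added

_, r2_unc_no_const = fit_r2(no_const, y)
r2_cen_with_const, _ = fit_r2(with_const, y)

print(f"no constant, uncentered R2: {r2_unc_no_const:.3f}")
print(f"with constant, centered R2: {r2_cen_with_const:.3f}")
```

The two numbers answer different questions, so an uncentered R² should not be compared directly with the R² values of models that include an intercept.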

Multiple univariate regressions

In [37]:
all_coefs_df = multiple_univariate_OLSs(X=X, y=y, features_list=features_list)
In [38]:
all_coefs_df.sort_values("rsquared_adj", ascending=False)
Out[38]:
coef std err t P>|t| [0.025 0.975] rsquared rsquared_adj
automaticity_put_on_mask 0.4652 0.016 29.656 0.0 0.434 0.496 0.279246 0.278928
automaticity_carry_mask 0.4463 0.015 29.076 0.0 0.416 0.476 0.271358 0.271037
effective_means_masks 0.4922 0.017 28.986 0.0 0.459 0.526 0.270137 0.269816
inst_attitude_protects_others 0.5895 0.023 25.875 0.0 0.545 0.634 0.227764 0.227423
norms_people_present_indoors 0.4811 0.019 25.760 0.0 0.444 0.518 0.226196 0.225855
norms_family_friends 0.4673 0.018 25.590 0.0 0.431 0.503 0.223887 0.223545
aff_attitude_responsible 0.5329 0.022 24.703 0.0 0.491 0.575 0.211873 0.211526
aff_attitude_safe 0.4776 0.022 21.636 0.0 0.434 0.521 0.170959 0.170593
norms_risk_groups 0.4905 0.024 20.656 0.0 0.444 0.537 0.158217 0.157847
risk_severity 0.3424 0.017 20.105 0.0 0.309 0.376 0.151148 0.150774
risk_fear_spread 0.3381 0.017 20.084 0.0 0.305 0.371 0.150888 0.150514
inst_attitude_protects_self 0.3992 0.020 19.860 0.0 0.360 0.439 0.148029 0.147654
aff_attitude_calm 0.3492 0.018 19.626 0.0 0.314 0.384 0.145063 0.144687
risk_fear_contagion_self 0.2967 0.016 18.679 0.0 0.266 0.328 0.133229 0.132847
inst_attitude_no_needless_waste 0.2747 0.015 18.436 0.0 0.246 0.304 0.130232 0.129849
aff_attitude_comfortable 0.3678 0.020 18.382 0.0 0.329 0.407 0.129570 0.129186
effective_means_distance 0.3964 0.022 18.291 0.0 0.354 0.439 0.128453 0.128069
risk_fear_contagion_others 0.3281 0.018 18.172 0.0 0.293 0.364 0.126997 0.126612
inst_attitude_enough_oxygen 0.2657 0.016 16.956 0.0 0.235 0.296 0.112417 0.112026
inst_attitude_sense_of_community 0.3166 0.019 16.663 0.0 0.279 0.354 0.108985 0.108592
risk_contagion_absent_protection 0.2862 0.017 16.404 0.0 0.252 0.320 0.105981 0.105587
norms_officials 0.4591 0.028 16.271 0.0 0.404 0.514 0.104444 0.104050
risk_fear_restrictions -0.2310 0.016 -14.264 0.0 -0.263 -0.199 0.082260 0.081856
aff_attitude_difficult_breathing -0.2361 0.019 -12.221 0.0 -0.274 -0.198 0.061729 0.061316
effective_means_ventilation 0.2403 0.020 11.732 0.0 0.200 0.280 0.057165 0.056750
effective_means_handwashing 0.2191 0.023 9.395 0.0 0.173 0.265 0.037432 0.037008
risk_likely_contagion 0.1032 0.025 4.065 0.0 0.053 0.153 0.007225 0.006788
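The table above has one row per feature, each from its own single-predictor regression. The internals of `multiple_univariate_OLSs` are not shown in this notebook, so the following is only a sketch of the general idea, assuming each fit uses one feature plus an intercept. The helper `univariate_ols_table` and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def univariate_ols_table(X, y):
    """Hypothetical helper: one intercept-plus-slope fit per column of X."""
    rows = {}
    n = len(y)
    y_arr = y.to_numpy(dtype=float)
    for col in X.columns:
        design = np.column_stack([np.ones(n), X[col].to_numpy(dtype=float)])
        beta, *_ = np.linalg.lstsq(design, y_arr, rcond=None)
        resid = y_arr - design @ beta
        r2 = 1.0 - np.sum(resid ** 2) / np.sum((y_arr - y_arr.mean()) ** 2)
        # Adjusted R2 for one predictor plus an intercept.
        r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
        rows[col] = {"coef": beta[1], "rsquared": r2, "rsquared_adj": r2_adj}
    return pd.DataFrame.from_dict(rows, orient="index")


# Synthetic data with one strong and one weak predictor.
rng = np.random.default_rng(1)
toy_X = pd.DataFrame({"strong": rng.normal(size=300),
                      "weak": rng.normal(size=300)})
toy_y = 5 + 1.0 * toy_X["strong"] + 0.1 * toy_X["weak"] + rng.normal(size=300)

table = univariate_ols_table(toy_X, toy_y).sort_values("rsquared_adj",
                                                       ascending=False)
print(table)
```

Each row ignores all other features, so these coefficients describe marginal associations and will generally differ from the coefficients of the multivariate model.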
In [39]:
top_feature = all_coefs_df.sort_values("rsquared_adj").tail(1).iloc[0].name
In [40]:
_ = sns.lmplot(data=df, 
               x=target, 
               y=top_feature, 
               hue="demographic_age",
              legend=True)
In [41]:
ax = sns.jointplot(data=df, 
                  x=target, 
                  y=top_feature, 
                  hue="demographic_age",
                  # kind="reg",
                   legend=True
                 )
In [42]:
_ = sns.boxplot(data=all_coefs_df[["rsquared_adj", "P>|t|"]].melt(),
                x="variable", y="value")
_ = plt.axhline(y=0.05, c="grey", ls="--")
In [43]:
mod = sm.OLS(endog=y, exog=X[[top_feature]])
res = mod.fit()
In [44]:
y_pred = res.predict(exog = X[[top_feature]])

df_test = pd.DataFrame({"y_pred": y_pred, target: y})

user_ids_first = df_test.head(1).index.tolist()[0]
user_ids_last = df_test.tail(1).index.tolist()[0]

plot_title="All"
In [45]:
rmse = np.sqrt(mean_squared_error(df_test[target], df_test['y_pred']))  # root mean squared error
bias = np.mean(df_test['y_pred'] - df_test[target])  # mean signed error

_ = plt.figure(figsize=(30,8))
_ = plt.title(f"Linear Regression (fitted set) | RMSE = {round(rmse, 4)} | bias Error = {round(bias, 4)} | {plot_title}")
# use_line_collection=True is dropped: the argument is deprecated in newer matplotlib releases
rmse_plot = plt.stem(df_test.index, df_test['y_pred'] - df_test[target], linefmt='grey', markerfmt='D')
_ = plt.hlines(y=round(rmse, 2), colors='b', linestyles='-.', label='+ RMSE', 
               xmin = user_ids_first, 
               xmax = user_ids_last
              ) 
_ = plt.hlines(y=round(-rmse, 2), colors='b', linestyles='-.', label='- RMSE', 
               xmin = user_ids_first, 
               xmax = user_ids_last
              ) 
_ = plt.xticks(rotation=90, ticks=df_test.index)
_ = plt.ylabel(f"Error = y_predicted - {target}")
_ = plt.legend()
_ = plt.show()
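The plot title reports two summaries of the fitted-set errors: the RMSE and the bias (the mean of predicted minus observed). A minimal sketch on made-up values shows how the two differ; the numbers are illustrative only.

```python
import numpy as np

# Toy observed and predicted values (illustrative only).
y_true = np.array([10.0, 9.5, 9.0, 8.0, 6.5])
y_hat = np.array([9.0, 9.0, 9.5, 8.5, 8.0])

errors = y_hat - y_true                # same sign convention as the plot
rmse = np.sqrt(np.mean(errors ** 2))   # typical size of an error
bias = np.mean(errors)                 # systematic over- or under-prediction

print(f"RMSE = {rmse:.4f}, bias = {bias:.4f}")
```

Positive and negative errors cancel in the bias but not in the RMSE, so a model can have a bias near zero and still make large individual errors.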
In [46]:
groups_dict = {"18 - 39": ['18-29','30-39'],
              "40 - 59": ['40-49', '50-59'],
              "60+": ['60+'],
              "All": ['60+', '40-49', '18-29', '50-59', '30-39'],
              "Lower Education": 0,
              "Higher Education": 1}
In [47]:
all_ols_df = pd.DataFrame()
for group in groups_dict:
    if isinstance(groups_dict[group], list):
        tmp_df = df[df["demographic_age"].isin(groups_dict[group])]
    else:
        tmp_df = df[df["demographic_higher_education"] == groups_dict[group]]
        
    tmp_X = tmp_df[features_list]
    tmp_y = tmp_df[target]

    tmp_ols_df = multiple_univariate_OLSs(X=tmp_X, 
                                          y=tmp_y, 
                                          features_list=features_list)[["coef", "P>|t|", "rsquared_adj"]]
    tmp_ols_df.columns = pd.MultiIndex.from_tuples([(group, x) for x in tmp_ols_df.columns.tolist()])
    all_ols_df = pd.concat([all_ols_df, tmp_ols_df], axis=1)
In [48]:
all_ols_df
Out[48]:
18 - 39 40 - 59 60+ All Lower Education Higher Education
coef P>|t| rsquared_adj coef P>|t| rsquared_adj coef P>|t| rsquared_adj coef P>|t| rsquared_adj coef P>|t| rsquared_adj coef P>|t| rsquared_adj
automaticity_carry_mask 0.4454 0.000 0.270860 0.4344 0.000 0.241427 0.3524 0.000 0.209231 0.4463 0.0 0.271037 0.4569 0.0 0.277210 0.4307 0.000 0.260773
automaticity_put_on_mask 0.4799 0.000 0.300845 0.4705 0.000 0.270563 0.3359 0.000 0.187063 0.4652 0.0 0.278928 0.4657 0.0 0.276595 0.4633 0.000 0.280191
inst_attitude_protects_self 0.3929 0.000 0.121830 0.4209 0.000 0.154815 0.2454 0.000 0.083291 0.3992 0.0 0.147654 0.3990 0.0 0.142026 0.3988 0.000 0.154712
inst_attitude_protects_others 0.6651 0.000 0.282943 0.5869 0.000 0.220588 0.4312 0.000 0.171396 0.5895 0.0 0.227423 0.5882 0.0 0.227619 0.5899 0.000 0.224331
inst_attitude_sense_of_community 0.3587 0.000 0.118604 0.3489 0.000 0.124954 0.2154 0.000 0.085699 0.3166 0.0 0.108592 0.3436 0.0 0.122202 0.2827 0.000 0.091534
inst_attitude_enough_oxygen 0.2922 0.000 0.122269 0.2974 0.000 0.129374 0.1746 0.000 0.079068 0.2657 0.0 0.112026 0.2802 0.0 0.116675 0.2461 0.000 0.103277
inst_attitude_no_needless_waste 0.3388 0.000 0.143568 0.2405 0.000 0.088683 0.1493 0.000 0.060426 0.2747 0.0 0.129849 0.2964 0.0 0.142381 0.2496 0.000 0.115458
norms_family_friends 0.4971 0.000 0.232984 0.4680 0.000 0.229955 0.3533 0.000 0.177433 0.4673 0.0 0.223545 0.4736 0.0 0.227569 0.4582 0.000 0.214838
norms_risk_groups 0.4319 0.000 0.127284 0.4867 0.000 0.160871 0.4841 0.000 0.173283 0.4905 0.0 0.157847 0.5176 0.0 0.166838 0.4536 0.000 0.143892
norms_officials 0.4811 0.000 0.097753 0.4472 0.000 0.099572 0.4771 0.000 0.178767 0.4591 0.0 0.104050 0.4940 0.0 0.118794 0.4073 0.000 0.082662
norms_people_present_indoors 0.4411 0.000 0.183380 0.4679 0.000 0.214813 0.4305 0.000 0.218075 0.4811 0.0 0.225855 0.4940 0.0 0.234965 0.4624 0.000 0.212460
aff_attitude_comfortable 0.3836 0.000 0.129391 0.4373 0.000 0.160466 0.2504 0.000 0.100396 0.3678 0.0 0.129186 0.3699 0.0 0.129415 0.3631 0.000 0.126935
aff_attitude_calm 0.3956 0.000 0.155858 0.3652 0.000 0.147922 0.2252 0.000 0.103092 0.3492 0.0 0.144687 0.3505 0.0 0.135067 0.3475 0.000 0.155289
aff_attitude_safe 0.5253 0.000 0.175998 0.4790 0.000 0.158149 0.3195 0.000 0.124132 0.4776 0.0 0.170593 0.4611 0.0 0.158287 0.5023 0.000 0.189416
aff_attitude_responsible 0.5683 0.000 0.221201 0.5268 0.000 0.192626 0.3924 0.000 0.174434 0.5329 0.0 0.211526 0.5522 0.0 0.218736 0.5063 0.000 0.200913
aff_attitude_difficult_breathing -0.2890 0.000 0.084542 -0.2775 0.000 0.073396 -0.1599 0.000 0.047479 -0.2361 0.0 0.061316 -0.2499 0.0 0.062891 -0.2170 0.000 0.056815
effective_means_handwashing 0.1223 0.003 0.010712 0.2206 0.000 0.034616 0.2348 0.000 0.054795 0.2191 0.0 0.037008 0.2980 0.0 0.058513 0.1423 0.000 0.018133
effective_means_masks 0.4607 0.000 0.243742 0.5072 0.000 0.253820 0.3930 0.000 0.205402 0.4922 0.0 0.269816 0.4998 0.0 0.262242 0.4820 0.000 0.280107
effective_means_distance 0.3681 0.000 0.122005 0.3559 0.000 0.098371 0.3042 0.000 0.068855 0.3964 0.0 0.128069 0.4252 0.0 0.146112 0.3531 0.000 0.101973
effective_means_ventilation 0.2187 0.000 0.038280 0.2245 0.000 0.049446 0.1737 0.000 0.044676 0.2403 0.0 0.056750 0.2577 0.0 0.063811 0.2133 0.000 0.045202
risk_likely_contagion 0.2217 0.000 0.025561 0.1058 0.014 0.006315 0.0839 0.015 0.006725 0.1032 0.0 0.006788 0.1307 0.0 0.009772 0.0680 0.054 0.002584
risk_contagion_absent_protection 0.3651 0.000 0.147379 0.3077 0.000 0.112138 0.1813 0.000 0.074338 0.2862 0.0 0.105587 0.3032 0.0 0.114076 0.2628 0.000 0.093314
risk_severity 0.3577 0.000 0.110056 0.2932 0.000 0.094976 0.2457 0.000 0.091535 0.3424 0.0 0.150774 0.3635 0.0 0.156173 0.3225 0.000 0.148234
risk_fear_spread 0.3703 0.000 0.165041 0.3513 0.000 0.146157 0.2056 0.000 0.087899 0.3381 0.0 0.150514 0.3532 0.0 0.157263 0.3183 0.000 0.141221
risk_fear_contagion_self 0.2942 0.000 0.115572 0.3128 0.000 0.129061 0.1717 0.000 0.067572 0.2967 0.0 0.132847 0.3048 0.0 0.133445 0.2878 0.000 0.133083
risk_fear_contagion_others 0.3810 0.000 0.153481 0.3669 0.000 0.150451 0.1870 0.000 0.065151 0.3281 0.0 0.126612 0.3477 0.0 0.132023 0.3040 0.000 0.119403
risk_fear_restrictions -0.2346 0.000 0.079227 -0.2402 0.000 0.076614 -0.1316 0.000 0.040946 -0.2310 0.0 0.081856 -0.2180 0.0 0.066313 -0.2465 0.000 0.104504
In [49]:
_ = plt.figure(figsize=(20,10))
_ = sns.heatmap(data=all_ols_df.sort_values(by = ("All", "coef"), ascending=False),
               annot=True)
_ = plt.xlabel("")
In [50]:
bootstrap_sample_size = 1000
bootstrap_number = 100
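The next cell repeatedly draws 1000 rows and recomputes the PCI on each draw. Two points are worth keeping in mind. First, `DataFrame.sample` draws without replacement unless `replace=True` is passed, so each draw is a subsample rather than a classical bootstrap resample. Second, the exact computation inside `potential_for_change_index` is not shown here. The sketch below uses a simplified, hypothetical stand-in (`toy_pci`: room for improvement multiplied by the absolute correlation with the target) purely to illustrate how repeated draws can be summarised; it should not be read as jmspack's implementation.

```python
import numpy as np
import pandas as pd


def toy_pci(data, features, target):
    """Simplified, hypothetical PCI: room for improvement times |r|."""
    out = {}
    for col in features:
        r = data[col].corr(data[target])  # Pearson r used as the weight
        if r >= 0:
            room = data[col].max() - data[col].mean()
        else:
            room = data[col].mean() - data[col].min()
        out[col] = room * abs(r)
    return pd.Series(out, name="PCI")


# Synthetic 1-7 items and a target that depends on them.
rng = np.random.default_rng(0)
n = 2000
toy = pd.DataFrame({
    "item_a": rng.integers(1, 8, size=n),
    "item_b": rng.integers(1, 8, size=n),
})
toy["target"] = 0.5 * toy["item_a"] - 0.2 * toy["item_b"] + rng.normal(size=n)

# Repeated subsampling: one column of PCI values per draw.
draws = pd.DataFrame({
    i: toy_pci(toy.sample(n=500, random_state=i), ["item_a", "item_b"], "target")
    for i in range(50)
})

# Summarise the spread of each feature's PCI across draws.
summary = pd.DataFrame({
    "mean": draws.mean(axis=1),
    "p2.5": draws.quantile(0.025, axis=1),
    "p97.5": draws.quantile(0.975, axis=1),
})
print(summary)
```

Reporting an interval across draws, rather than a single PCI per feature, gives a rough sense of how stable the ranking of features is.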
In [51]:
all_pcis_df = pd.DataFrame()
for i in range(0, bootstrap_number):
    tmp_pci_df = potential_for_change_index(data=df
                                            .sample(n=bootstrap_sample_size, random_state=0 + i)
                                            .drop(["demographic_age", "demographic_higher_education"], axis=1),
                                           features_list=features_list,
                                            target=target,
                                            minimum_measure = 'min',
                                            centrality_measure = 'mean',
                                            maximum_measure = 'max',
                                            weight_measure = 'r-value',
                                            scale_data = False,
                                            pci_heatmap = False,)
    all_pcis_df = pd.concat([all_pcis_df, tmp_pci_df["PCI"]], axis=1)
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
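The FutureWarning above is emitted from a `DataFrame.append` call inside jmspack, once per call to `potential_for_change_index`. A minimal sketch of silencing it around a noisy call with the standard `warnings` module follows; `noisy_call` is a hypothetical stand-in for the library function and is not part of this notebook.

```python
import warnings


def noisy_call():
    # Hypothetical stand-in for a library call that emits a FutureWarning.
    warnings.warn("frame.append is deprecated", FutureWarning)
    return 42


# The filter change is local to the with-block and is undone on exit.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=FutureWarning)
    result = noisy_call()
```

Scoping the filter to the call keeps other deprecation messages visible elsewhere in the notebook.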
In [52]:
all_pcis_df.columns = [f"PCI_{x}" for x in range(0, all_pcis_df.shape[1])]
In [53]:
_ = plt.figure(figsize=(6, 8))
_ = sns.heatmap(all_pcis_df.sort_values(by="PCI_0", ascending=False))
In [54]:
_ = plt.figure(figsize=(6, 8))
_ = sns.heatmap(all_pcis_df.agg(["min", "mean", "median", "max"], axis=1).sort_values(by="mean", ascending=False),
               annot=True,
               fmt=".3g")
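The heatmap above summarises each feature across all PCI columns. As a small illustration of what `agg(..., axis=1)` returns, here is a sketch on made-up numbers; the frame `toy_pcis` and its values are assumptions for demonstration only.

```python
import pandas as pd

# Toy PCI table: rows are features, columns are PCIs from different subsets.
toy_pcis = pd.DataFrame(
    {"PCI_0": [0.20, 0.10], "PCI_1": [0.30, 0.05], "PCI_2": [0.25, 0.15]},
    index=["feature_a", "feature_b"],
)

# axis=1 applies each aggregation across the columns of every row,
# giving one row per feature and one column per statistic.
summary = toy_pcis.agg(["min", "mean", "median", "max"], axis=1)
ranked = summary.sort_values(by="mean", ascending=False)
```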
In [55]:
all_pcis_df = pd.DataFrame()
for group in groups_dict:
    if isinstance(groups_dict[group], list):
        tmp_df = df[df["demographic_age"].isin(groups_dict[group])]
    else:
        tmp_df = df[df["demographic_higher_education"] == groups_dict[group]]

    tmp_pci_df = potential_for_change_index(data=tmp_df.drop(["demographic_age", "demographic_higher_education"], axis=1),
                                           features_list=features_list,
                                            target=target,
                                            minimum_measure = 'min',
                                            centrality_measure = 'mean',
                                            maximum_measure = 'max',
                                            weight_measure = 'r-value',
                                            scale_data = True,
                                            pci_heatmap = False,)
    tmp_pci_df = tmp_pci_df.rename(columns={"PCI": f"PCI_{group}"})
    all_pcis_df = pd.concat([all_pcis_df, tmp_pci_df[f"PCI_{group}"]], axis=1)
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
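The loop above computes one PCI column per demographic subset and appends it to a growing frame. A possible alternative, sketched here on toy data, collects the per-subset results in a dict and concatenates once. The function `score_features`, the `subsets` mapping and the toy columns are hypothetical stand-ins, not the notebook's `groups_dict` or jmspack's function.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "group": ["young", "young", "old", "old"],
        "feature_a": [1.0, 3.0, 5.0, 7.0],
        "feature_b": [2.0, 2.0, 4.0, 6.0],
    }
)


def score_features(frame):
    # Hypothetical stand-in for potential_for_change_index:
    # returns one value per feature (here simply the mean).
    return frame[["feature_a", "feature_b"]].mean()


# Hypothetical subset definitions; None selects all rows.
subsets = {"All": None, "young": "young", "old": "old"}

per_group = {}
for name, value in subsets.items():
    subset = toy if value is None else toy[toy["group"] == value]
    per_group[f"PCI_{name}"] = score_features(subset)

# A single concat aligns every Series on the shared feature index;
# the dict keys become the column names.
all_scores = pd.concat(per_group, axis=1)
```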
In [56]:
_ = plt.figure(figsize=(6, 8))
_ = sns.heatmap(all_pcis_df.sort_values(by="PCI_All", ascending=False),
               annot=True,
                fmt=".3g"
               )
In [57]:
all_group_pci_df = potential_for_change_index(data=df.drop(["demographic_age", "demographic_higher_education"], axis=1),
                                           features_list=features_list,
                                            target=target,
                                            minimum_measure = 'min',
                                            centrality_measure = 'mean',
                                            maximum_measure = 'max',
                                            weight_measure = 'r-value',
                                            scale_data = True,
                                            pci_heatmap = False,)
all_group_pci_df = all_group_pci_df.rename(columns={"PCI": "PCI_All"})
/opt/miniconda3/envs/ds_env/lib/python3.10/site-packages/jmspack/frequentist_statistics.py:213: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
  info = info.append(
In [58]:
meta_df.columns
Out[58]:
Index(['Original position', 'Variable name', 'Label',
       'Item english translation ', 'Label short', 'Type', 'New variable name',
       'variable name helper',
       'Of primary interest as a predictor (i.e. feature)?',
       'English lo-anchor', 'English hi-anchor'],
      dtype='object')
In [59]:
relabel_column = "Label short" # "Item english translation "

pd.set_option("display.max_colwidth", 0)
pd.set_option("display.width", 0)
all_group_pci_df_new=(pd.merge(all_group_pci_df, 
         meta_df.loc[meta_df["New variable name"].isin(features_list), [relabel_column, 'English lo-anchor', 'English hi-anchor', "New variable name"]],
         left_index=True,
         right_on="New variable name")
 .set_index([relabel_column, 'English lo-anchor', 'English hi-anchor', "New variable name"])
#  .drop("New variable name", axis=1)
)
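The merge above joins the PCI results, which are indexed by variable name, to the metadata column holding the same names. A minimal sketch of that index-to-column join on toy frames; the variable names and labels are made up for illustration.

```python
import pandas as pd

# Toy results indexed by variable name, as the PCI output is.
scores = pd.DataFrame({"PCI_All": [0.2, 0.1]}, index=["var_a", "var_b"])

# Toy metadata with one row per variable, including one unused variable.
meta = pd.DataFrame(
    {
        "New variable name": ["var_b", "var_a", "var_c"],
        "Label short": ["Label B", "Label A", "Label C"],
    }
)

# left_index=True matches the left frame's index against the right-hand column.
# The default inner join drops var_c, which has no score.
labelled = pd.merge(
    scores,
    meta,
    left_index=True,
    right_on="New variable name",
).set_index(["Label short", "New variable name"])
```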
In [60]:
pd.set_option("styler.format.precision", 3)
(all_group_pci_df_new
 .reindex(all_group_pci_df_new["PCI_All"].abs().sort_values(ascending=False).index)
#  .sort_values(by="PCI_All", ascending=False)
 .round(3)
 .style.bar(subset=['PCI_All', 'mean', "r-value"], 
            align='mid', 
            color=['#d65f5f', '#5fba7d'])
)
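The cell above orders rows by the absolute value of `PCI_All`, so that strongly negative indices rank alongside strongly positive ones. A short sketch of that pattern on toy values, together with a `sort_values(key=...)` variant that gives the same ordering here (the `key` argument needs pandas 1.1 or later):

```python
import pandas as pd

toy = pd.DataFrame({"PCI_All": [0.10, -0.30, 0.20]}, index=["a", "b", "c"])

# Pattern used above: order the index by |PCI|, then reindex the frame.
order = toy["PCI_All"].abs().sort_values(ascending=False).index
by_abs = toy.reindex(order)

# Variant: let sort_values apply abs() to the sort key only,
# leaving the stored values (and their signs) unchanged.
by_abs_key = toy.sort_values(by="PCI_All", key=lambda s: s.abs(), ascending=False)
```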
Out[60]:
| Label short | English lo-anchor | English hi-anchor | New variable name | PCI_All | min | mean | max | r-value | p-value |
|---|---|---|---|---|---|---|---|---|---|
| When I use a face mask, I feel or would feel ... | Very uncomfortable | Very comfortable | aff_attitude_comfortable | 0.205 | 0.000 | 0.431 | 1.000 | 0.360 | 0.000 |
| Perceived risk coronavirus infection with no protective behaviours | Very unlikely | Very likely | risk_contagion_absent_protection | 0.188 | 0.000 | 0.421 | 1.000 | 0.326 | 0.000 |
| If or when I use a face mask… | I produce unnecessary waste | I do not produce unnecessary waste | inst_attitude_no_needless_waste | 0.176 | 0.000 | 0.511 | 1.000 | 0.361 | 0.000 |
| I would get infected myself .. | Doesn't scare me | Scares me | risk_fear_contagion_self | 0.170 | 0.000 | 0.535 | 1.000 | 0.365 | 0.000 |
| Perceived risk severity coronavirus infection | Not serious at all | Very serious | risk_severity | 0.168 | 0.000 | 0.567 | 1.000 | 0.389 | 0.000 |
| Spread of coronavirus… | Doesn't scare me | Scares me | risk_fear_spread | 0.163 | 0.000 | 0.582 | 1.000 | 0.388 | 0.000 |
| When I use a face mask, I feel or would feel ... | Very anxious | Very calm | aff_attitude_calm | 0.155 | 0.000 | 0.593 | 1.000 | 0.381 | 0.000 |
| When I use a face mask, I feel or would feel ... | Very insecure | Very safe | aff_attitude_safe | 0.141 | 0.000 | 0.660 | 1.000 | 0.413 | 0.000 |
| If or when I use a face mask… | I get enough oxygen | I don't get enough oxygen | inst_attitude_enough_oxygen | 0.137 | 0.000 | 0.592 | 1.000 | 0.335 | 0.000 |
| When I use a face mask, I feel or would feel ... | Very easy to breathe | Very difficult to breathe | aff_attitude_difficult_breathing | -0.134 | 0.000 | 0.540 | 1.000 | -0.248 | 0.000 |
| If or when I use a face mask… | It decreases sense of community | It increases sense of community | inst_attitude_sense_of_community | 0.125 | 0.000 | 0.621 | 1.000 | 0.330 | 0.000 |
| Is taking a mask with you automatic for you? | Not at all automatic | Fully automatic | automaticity_carry_mask | 0.125 | 0.000 | 0.761 | 1.000 | 0.521 | 0.000 |
| Using a face mask | Ineffective | Effective | effective_means_masks | 0.117 | 0.000 | 0.776 | 1.000 | 0.520 | 0.000 |
| Is putting on a mask automatic for you? | Not at all automatic | Fully automatic | automaticity_put_on_mask | 0.116 | 0.000 | 0.781 | 1.000 | 0.528 | 0.000 |
| Measures taken to prevent the spread | Doesn't scare me | Scares me | risk_fear_restrictions | -0.115 | 0.000 | 0.401 | 1.000 | -0.287 | 0.000 |
| If or when I use a face mask… | I expose myself to coronavirus infection | I protect myself from coronavirus infection | inst_attitude_protects_self | 0.102 | 0.000 | 0.734 | 1.000 | 0.385 | 0.000 |
| Loved one would get infected... | Doesn't scare me | Scares me | risk_fear_contagion_others | 0.100 | 0.000 | 0.720 | 1.000 | 0.356 | 0.000 |
| In the indoors spaces I visit, people on the site think I should… | Not to use a mask | Use a mask | norms_people_present_indoors | 0.094 | 0.000 | 0.802 | 1.000 | 0.476 | 0.000 |
| When I use a face mask, I feel or would feel ... | Very irresponsible | Very responsible | aff_attitude_responsible | 0.090 | 0.000 | 0.804 | 1.000 | 0.460 | 0.000 |
| My family and friends think I should .. | Not to use a mask | Use a mask | norms_family_friends | 0.085 | 0.000 | 0.820 | 1.000 | 0.473 | 0.000 |
| Perceived risk coronavirus infection | Very unlikely | Very likely | risk_likely_contagion | 0.069 | 0.000 | 0.187 | 1.000 | 0.085 | 0.000 |
| If or when I use a face mask… | I expose others to coronavirus infection | I protect others from coronavirus infection | inst_attitude_protects_others | 0.066 | 0.000 | 0.863 | 1.000 | 0.477 | 0.000 |
| Keeping a safety distance (2 meters) | Ineffective | Effective | effective_means_distance | 0.062 | 0.000 | 0.826 | 1.000 | 0.358 | 0.000 |
| Ventilation | Ineffective | Effective | effective_means_ventilation | 0.059 | 0.000 | 0.754 | 1.000 | 0.239 | 0.000 |
| People at risk think I should .. | Not to use a mask | Use a mask | norms_risk_groups | 0.044 | 0.000 | 0.889 | 1.000 | 0.398 | 0.000 |
| Hand washing and use of gloves | Ineffective | Effective | effective_means_handwashing | 0.029 | 0.000 | 0.851 | 1.000 | 0.193 | 0.000 |
| The authorities think I should .. | Not to use a mask | Use a mask | norms_officials | 0.027 | 0.000 | 0.918 | 1.000 | 0.323 | 0.000 |
In [61]:
# Diverging colormap used to shade the PCI column
cmap = sns.diverging_palette(5, 250, as_cmap=True)

# Order rows by absolute PCI (largest first), then shade only the PCI_All column
(all_group_pci_df_new
 .reindex(all_group_pci_df_new["PCI_All"].abs().sort_values(ascending=False).index)
 .style.background_gradient(cmap=cmap,
                            subset=["PCI_All"],
                            axis=1,
                            vmin=-0.15,
                            vmax=0.25)
)
Out[61]:
| Label short | English lo-anchor | English hi-anchor | New variable name | PCI_All | min | mean | max | r-value | p-value |
|---|---|---|---|---|---|---|---|---|---|
| When I use a face mask, I feel or would feel ... | Very uncomfortable | Very comfortable | aff_attitude_comfortable | 0.205 | 0.000 | 0.431 | 1.000 | 0.360 | 0.000 |
| Perceived risk coronavirus infection with no protective behaviours | Very unlikely | Very likely | risk_contagion_absent_protection | 0.188 | 0.000 | 0.421 | 1.000 | 0.326 | 0.000 |
| If or when I use a face mask… | I produce unnecessary waste | I do not produce unnecessary waste | inst_attitude_no_needless_waste | 0.176 | 0.000 | 0.511 | 1.000 | 0.361 | 0.000 |
| I would get infected myself .. | Doesn't scare me | Scares me | risk_fear_contagion_self | 0.170 | 0.000 | 0.535 | 1.000 | 0.365 | 0.000 |
| Perceived risk severity coronavirus infection | Not serious at all | Very serious | risk_severity | 0.168 | 0.000 | 0.567 | 1.000 | 0.389 | 0.000 |
| Spread of coronavirus… | Doesn't scare me | Scares me | risk_fear_spread | 0.163 | 0.000 | 0.582 | 1.000 | 0.388 | 0.000 |
| When I use a face mask, I feel or would feel ... | Very anxious | Very calm | aff_attitude_calm | 0.155 | 0.000 | 0.593 | 1.000 | 0.381 | 0.000 |
|  | Very insecure | Very safe | aff_attitude_safe | 0.141 | 0.000 | 0.660 | 1.000 | 0.413 | 0.000 |
| If or when I use a face mask… | I get enough oxygen | I don't get enough oxygen | inst_attitude_enough_oxygen | 0.137 | 0.000 | 0.592 | 1.000 | 0.335 | 0.000 |
| When I use a face mask, I feel or would feel ... | Very easy to breathe | Very difficult to breathe | aff_attitude_difficult_breathing | -0.134 | 0.000 | 0.540 | 1.000 | -0.248 | 0.000 |
| If or when I use a face mask… | It decreases sense of community | It increases sense of community | inst_attitude_sense_of_community | 0.125 | 0.000 | 0.621 | 1.000 | 0.330 | 0.000 |
| Is taking a mask with you automatic for you? | Not at all automatic | Fully automatic | automaticity_carry_mask | 0.125 | 0.000 | 0.761 | 1.000 | 0.521 | 0.000 |
| Using a face mask | Ineffective | Effective | effective_means_masks | 0.117 | 0.000 | 0.776 | 1.000 | 0.520 | 0.000 |
| Is putting on a mask automatic for you? | Not at all automatic | Fully automatic | automaticity_put_on_mask | 0.116 | 0.000 | 0.781 | 1.000 | 0.528 | 0.000 |
| Measures taken to prevent the spread | Doesn't scare me | Scares me | risk_fear_restrictions | -0.115 | 0.000 | 0.401 | 1.000 | -0.287 | 0.000 |
| If or when I use a face mask… | I expose myself to coronavirus infection | I protect myself from coronavirus infection | inst_attitude_protects_self | 0.102 | 0.000 | 0.734 | 1.000 | 0.385 | 0.000 |
| Loved one would get infected... | Doesn't scare me | Scares me | risk_fear_contagion_others | 0.100 | 0.000 | 0.720 | 1.000 | 0.356 | 0.000 |
| In the indoors spaces I visit, people on the site think I should… | Not to use a mask | Use a mask | norms_people_present_indoors | 0.094 | 0.000 | 0.802 | 1.000 | 0.476 | 0.000 |
| When I use a face mask, I feel or would feel ... | Very irresponsible | Very responsible | aff_attitude_responsible | 0.090 | 0.000 | 0.804 | 1.000 | 0.460 | 0.000 |
| My family and friends think I should .. | Not to use a mask | Use a mask | norms_family_friends | 0.085 | 0.000 | 0.820 | 1.000 | 0.473 | 0.000 |
| Perceived risk coronavirus infection | Very unlikely | Very likely | risk_likely_contagion | 0.069 | 0.000 | 0.187 | 1.000 | 0.085 | 0.000 |
| If or when I use a face mask… | I expose others to coronavirus infection | I protect others from coronavirus infection | inst_attitude_protects_others | 0.066 | 0.000 | 0.863 | 1.000 | 0.477 | 0.000 |
| Keeping a safety distance (2 meters) | Ineffective | Effective | effective_means_distance | 0.062 | 0.000 | 0.826 | 1.000 | 0.358 | 0.000 |
| Ventilation | Ineffective | Effective | effective_means_ventilation | 0.059 | 0.000 | 0.754 | 1.000 | 0.239 | 0.000 |
| People at risk think I should .. | Not to use a mask | Use a mask | norms_risk_groups | 0.044 | 0.000 | 0.889 | 1.000 | 0.398 | 0.000 |
| Hand washing and use of gloves | Ineffective | Effective | effective_means_handwashing | 0.029 | 0.000 | 0.851 | 1.000 | 0.193 | 0.000 |
| The authorities think I should .. | Not to use a mask | Use a mask | norms_officials | 0.027 | 0.000 | 0.918 | 1.000 | 0.323 | 0.000 |
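The PCI_All values in this table appear consistent with a simple rule: the room left for change on the scaled item (max minus mean when the correlation is positive, mean minus min when it is negative) multiplied by the r-value. The sketch below checks that reading against three rows copied from the table. It is inferred from the displayed numbers only and is not taken from the jmspack source; `pci_sketch` is a hypothetical helper name, not a jmspack function.

```python
import pandas as pd

# Values copied from three rows of the table above (items scaled to 0-1).
rows = pd.DataFrame(
    {
        "min": [0.0, 0.0, 0.0],
        "mean": [0.431, 0.540, 0.401],
        "max": [1.0, 1.0, 1.0],
        "r-value": [0.360, -0.248, -0.287],
    },
    index=[
        "aff_attitude_comfortable",
        "aff_attitude_difficult_breathing",
        "risk_fear_restrictions",
    ],
)


def pci_sketch(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: room for change multiplied by the correlation."""
    # Assumption: a positive r means the item can still move up (max - mean),
    # a negative r means it can still move down (mean - min).
    room_up = df["max"] - df["mean"]
    room_down = df["mean"] - df["min"]
    room = room_up.where(df["r-value"] >= 0, room_down)
    return (room * df["r-value"]).rename("PCI_sketch")


pci = pci_sketch(rows)

# Same ordering idea as the cell above: largest absolute PCI first.
pci_sorted = pci.sort_values(key=abs, ascending=False)
print(pci_sorted.round(3))
```

For these three rows the sketch gives 0.205, -0.134 and -0.115 after rounding, which matches the PCI_All column. Other rows can differ in the last digit because the table shows rounded means and correlations.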
In [62]:
!jupyter nbconvert --to html PCI_lin_reg_corrs_clean.ipynb
[NbConvertApp] Converting notebook PCI_lin_reg_corrs_clean.ipynb to html
[NbConvertApp] Writing 2926035 bytes to PCI_lin_reg_corrs_clean.html