Research

Refereed Publications

2021

Variants of Mixtures: Information Properties and Applications

co-authored with Majid Asadi, Nader Ebrahimi, and Ehsan Soofi, Journal of the Iranian Statistical Society, forthcoming.

Abstract: In recent years, we have studied information properties of various types of mixtures of probability distributions and introduced a new type, which includes previously known mixtures as special cases. These studies are disseminated in different fields: reliability engineering, econometrics, operations research, probability, information theory, and data mining. This paper presents a holistic view of these studies and provides further insights and examples. We note that the insightful probabilistic formulation of the mixing parameters stipulated by Behboodian (1972) is required for a representation of the well-known information measure of the arithmetic mixture. Applications of this information measure presented in this paper include lifetime modeling, system reliability, measuring uncertainty and disagreement of forecasters, probability modeling with partial information, and information loss of kernel estimation. Probabilistic formulations of the mixing weights for various types of mixtures provide the Bayes-Fisher information and the Bayes risk of the mean residual function.
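
For concreteness, a sketch of the central objects in standard notation (definitions only; the paper's development goes well beyond this): the arithmetic mixture and its well-known information measure, which under Behboodian's probabilistic formulation of the weights equals the mutual information between the latent mixing indicator Z, with P(Z = i) = p_i, and X:

\[
f(x) = \sum_{i=1}^{n} p_i f_i(x), \qquad
H(f) - \sum_{i=1}^{n} p_i H(f_i) = I(Z; X) \ge 0,
\]

where \( H(f) = -\int f(x) \log f(x)\, dx \) is the Shannon entropy. (This quantity is the Jensen-Shannon divergence of \( f_1, \dots, f_n \) with weights \( p_1, \dots, p_n \).)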

Prediction of Days-On-Market for Single-Family Homes in the Housing Market

co-authored with Keagan Galbraithy, Ray Hashemi, and Jason Beck, Proceedings of the International Conference on Data Science, forthcoming.

Abstract: The number of days that a home stays on the housing market (Days-On-Market, DOM) provides crucial information at both the micro level (behavior associated with the buyer's/seller's decisions) and the macro level (risk associated with real estate investments and identification of housing bubbles). Housing data contain a mixture of simple and complex attributes. A complex attribute, in contrast with a simple attribute, has an array of values for a real estate property, which creates a major challenge in predicting DOM. The goals of this research effort are: (a) accommodating complex attributes in DOM prediction, (b) analyzing, designing, and implementing a DOM prediction package using Naïve Bayes and linear regression separately, and (c) establishing the superiority and robustness of the underlying models.
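
A minimal sketch of one way to accommodate a complex (array-valued) attribute, assuming a simple summarization step before a standard learner; the data, feature names, and summaries here are hypothetical and not the paper's method:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    def summarize_complex(values):
        """Collapse an array-valued attribute into scalar features."""
        v = np.asarray(values, dtype=float)
        return [v.mean(), v.min(), v.max(), v.size]

    # Toy complex attribute: room sizes per property (arrays of unequal length)
    rooms = [[12.0, 9.5, 14.2], [10.0, 11.5], [15.0, 9.0, 8.5, 12.0]]
    X = np.array([summarize_complex(r) for r in rooms])
    y = np.array([45.0, 30.0, 62.0])  # toy days-on-market targets
    model = LinearRegression().fit(X, y)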

2020

MR Plot: A Big Data Tool for Distinguishing Distributions

co-authored with Majid Asadi, Nader Ebrahimi, and Ehsan Soofi, Statistical Analysis and Data Mining: The American Statistical Association Data Science Journal, 2020, 13, 405-418. DOI: 10.1002/sam.11464.

Abstract: Big data enables reliable estimation of continuous probability density, cumulative distribution, survival, hazard rate, and mean residual functions (MRFs). We illustrate that the plot of the MRF provides the best resolution for distinguishing between distributions. At each point, the MRF gives the mean excess of the data beyond the threshold. The graph of the empirical MRF, called here the MR plot, provides an effective visualization tool. A variety of theoretical and data-driven examples illustrate that MR plots of big data preserve the shape of the MRF and that complex models require bigger data. The MRF is an optimal predictor of the excess of the random variable. With a suitable prior, the expected MRF gives the Bayes risk in the form of the entropy functional of the survival function, called here the survival entropy. We show that the survival entropy is dominated by the standard deviation (SD) and that equality between the two measures characterizes the exponential distribution. The empirical survival entropy provides a data concentration statistic which is strongly consistent, easy to compute, and less sensitive than the SD to heavy-tailed data. An application uses the New York City Taxi database, with millions of trip times, to illustrate the MR plot as a powerful tool for distinguishing distributions.
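
The empirical MRF behind the MR plot is straightforward to compute from the definition above (mean excess of the data beyond each threshold). A minimal sketch, with variable names and tie handling my own, using the exponential case as a check since its MRF is constant at the scale parameter:

    import numpy as np
    import matplotlib.pyplot as plt

    def empirical_mrf(x):
        """At each threshold t = x_(i), the mean excess of observations above t."""
        x = np.sort(np.asarray(x, dtype=float))
        n = x.size
        tail_sums = np.cumsum(x[::-1])[::-1] - x   # sum of x_j for j > i
        counts = n - 1 - np.arange(n)              # number of x_j above x_(i)
        return x[:-1], tail_sums[:-1] / counts[:-1] - x[:-1]

    x = np.random.exponential(scale=10.0, size=100_000)  # stand-in for trip times
    t, m = empirical_mrf(x)
    plt.step(t, m)   # should hover near 10, the exponential's constant MRF
    plt.xlabel("threshold t"); plt.ylabel("mean excess")
    plt.show()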

A Mediated Multi-RNN Hybrid System for Prediction of Stock Prices

co-authored with Ray Hashemi, Azita Bahrami, and Jeffrey Young, Proceedings of the International Conference on Computational Science and Computational Intelligence, forthcoming.

Abstract: A multi-recurrent neural network (RNN) hybrid system made up of three RNNs is introduced to predict the stock prices of 10 different companies (five selected from the Dow Jones Industrial Average and five from the Standard and Poor’s 500). The daily historical data used to train and test the system were collected for the period October 15, 2013 to March 5, 2019. For each company, the system provides two separate predictions of the daily stock price by using (1) historical stock prices and (2) historical trends along with the historical daily net changes in stock price. The two predictions are mediated to select one as the final output of the hybrid system. For each company, the accuracy of the system was tested on the prediction of the most recent 98 consecutive days using the Mean Squared Error (MSE) forecast accuracy measure. The results revealed that, for every company, the difference between the predicted and actual stock prices is not statistically different from zero, the ideal (error-free) forecast.
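
As a rough illustration of one building block, a single RNN mapping a window of past prices to the next day's price; the window length, layer sizes, and preprocessing below are assumptions for the sketch, not the authors' three-RNN mediated design:

    import numpy as np
    from tensorflow import keras

    WINDOW = 30  # assumed look-back length

    def make_windows(prices, window=WINDOW):
        """Turn a 1-D price array into (samples, window, 1) inputs and next-day targets."""
        X = np.stack([prices[i:i + window] for i in range(len(prices) - window)])
        return X[..., None], prices[window:]

    model = keras.Sequential([
        keras.layers.SimpleRNN(32, input_shape=(WINDOW, 1)),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")  # MSE, the accuracy measure used above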

2018

Re-evaluating the Effectiveness of Inflation Targeting

co-authored with Kundan Kishor and Suyong Song, Journal of Economic Dynamics and Control, 2018, 90, 76-97. DOI: 10.1016/j.jedc.2018.01.045.

Abstract: This paper estimates the treatment effect of inflation targeting on macroeconomic variables using a semiparametric single-index method that takes into account the model misspecification of parametric propensity scores. Our study uses a broader set of preconditions for inflation targeting and macroeconomic outcome variables than the existing literature. The results suggest no significant difference in the inflation level and inflation volatility between targeters and non-targeters after the adoption of inflation targeting. We find that inflation targeting reduces the sacrifice ratio and interest rate volatility in developed economies, and that it enhances fiscal discipline in both industrial and developing countries.
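
In standard notation, the estimand and the nuisance function involved are (the paper's contribution is to estimate the propensity score semiparametrically as a single index, \( p(x) = G(x'\theta) \) with unknown link G, rather than with a fixed parametric form):

\[
\mathrm{ATT} = E\left[\, Y(1) - Y(0) \mid D = 1 \,\right], \qquad p(x) = P(D = 1 \mid X = x),
\]

where D indicates adoption of inflation targeting and Y(1), Y(0) are potential outcomes (e.g., inflation volatility) with and without adoption.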

Ranking Forecasts by Stochastic Error Distance, Information and Reliability Measures

co-authored with Nader Ebrahimi and Ehsan Soofi, International Statistical Review, 2018, 86(3), 442-468. DOI: 10.1111/insr.12250. Online Appendix.

Abstract: The stochastic error distance (SED) introduced by Diebold and Shin (2017) ranks forecast models by divergence between distributions of the errors of the actual and perfect forecast models. The basic SED is defined by the variation distance and provides a representation of the mean absolute error, but by basing the ranking on the entire error distribution and on divergence, the SED moves beyond traditional forecast evaluations. First, we establish connections between ranking forecast models by the SED, error entropy, and some partial orderings of distributions. Then, we introduce the notion of excess error for forecast errors of magnitudes larger than a tolerance threshold and give the SED representation of the mean excess error (MEE). As a function of the threshold, the MEE is a local risk measure. With the distribution of the absolute error as a prior for the threshold, its Bayes risk is the entropy functional of the survival function, which is a known measure in information theory and reliability. Notions and results are illustrated using various distributions for the error. The empirical versions of SED, MEE and its Bayes risk are compared with the mean squared error in ranking regression and autoregressive integrated moving average models for forecasting bond risk premia.
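
The variation-distance SED and its mean absolute error representation described above can be written as follows, where F is the CDF of the forecast error e and 1{s ≥ 0} is the CDF of the perfect forecast's degenerate error:

\[
\mathrm{SED}(F) = \int_{-\infty}^{\infty} \bigl| F(s) - \mathbf{1}\{s \ge 0\} \bigr| \, ds
= \int_{-\infty}^{0} F(s)\, ds + \int_{0}^{\infty} \bigl[ 1 - F(s) \bigr] \, ds = E|e|.
\]

One natural formalization consistent with the abstract is that the MEE at tolerance t is the mean residual function of the absolute error, \( \mathrm{MEE}(t) = E\left[\, |e| - t \mid |e| > t \,\right] \), which is local in the sense that it depends on F only through its tail beyond t.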

Examining the Success of the Central Banks in Inflation Targeting Countries: The Dynamics of the Inflation Gap and Institutional Characteristics

co-authored with Kundan Kishor, Studies in Nonlinear Dynamics and Econometrics, 2018, 22(1). DOI: 10.1515/snde-2016-0085.

Abstract: This paper analyzes the performance of the central banks in inflation targeting (IT) countries by examining their success in achieving their explicit inflation targets. For this purpose, we decompose the inflation gap, the difference between actual inflation and the inflation target, into predictable and unpredictable components. We argue that a central bank is successful if the predictable component diminishes over time. The predictable component of the inflation gap is measured by the conditional mean of a parsimonious time-varying autoregressive model. Our results reveal considerable heterogeneity in the success of these IT countries in achieving their targets at the start of this policy regime. Our findings suggest that the central banks of the IT-adopting countries started targeting inflation implicitly before becoming explicit inflation targeters. The panel data analysis suggests that the relative success of these countries in reducing the gap is influenced by their institutional characteristics, particularly fiscal discipline and macroeconomic performance.
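
One parsimonious time-varying AR(1) specification of the kind described (illustrative notation; the paper's exact specification may differ):

\[
g_t = \alpha_t + \rho_t\, g_{t-1} + \varepsilon_t, \qquad
\alpha_t = \alpha_{t-1} + v_t, \qquad \rho_t = \rho_{t-1} + w_t,
\]

where \( g_t \) is the inflation gap. The predictable component is the conditional mean \( E_{t-1}[g_t] \), and a central bank is judged successful when this component shrinks toward zero over time.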

A Mining Driven Decision Support System for Joining the European Monetary Union

co-authored with Ray Hashemi, Azita Bahrami, Jeffrey Young, and Rosina Campbell. Proceedings of the International Conference on Advances in Information Mining and Management, 2018, 39-45. ISBN: 978-1-61208-654-5.

Abstract: The European Monetary Union (EMU) is the result of an economic integration of European Union member states into a unified economic system. The literature is divided on whether EMU members benefit from this monetary unification. Considering costs and benefits, a fiscal authority may ask whether joining the EMU is a good decision. We introduce and develop a decision support system to answer this question using a historical dataset of twelve Macroeconomic Outcomes (MOs) for 31 European countries over 18 years (1999-2016). The system meets a three-prong goal: (1) identifying highly relevant MOs for a given year, yi, using the data from years y1 to yi; (2) deriving a “join/not-join” decision along with its certainty factor using the relevant MOs for yi; and (3) examining the accuracy of the derived decision using the data from yi+1 to y18. The performance analysis of the system reveals that (a) the number of relevant MOs has declined nonlinearly over time, (b) the relevant MOs and decisions changed significantly before and after the European debt crisis, and (c) the decisions derived by the system have 79% accuracy.

2017

Extraction of the Essential Constituents of the S&P 500 Index

co-authored with Ray Hashemi, Azita Bahrami, and Jeffrey Young, Proceedings of the International Conference on Computational Science and Computational Intelligence, 2017, 350-356. DOI: 10.1109/CSCI.2017.59.

Abstract: Standard and Poor’s ranks S&P 500 components based on a weighting scheme and identifies a set of top companies. The weighting scheme relies only on an individual company’s market value and ignores the impact of collective market values on the index. We introduce a ranking methodology based on entropy, which results in a new set of top components, and then compare its predictive power with reference to the index. For this comparison, we develop a method based on Markov chain and hidden Markov chain models. The results reveal that the set of top companies identified by the entropy approach provides a more accurate prediction of the S&P 500 index.
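
One plausible reading of an entropy-based ranking, sketched below, scores each component by its contribution to the Shannon entropy of the index weight distribution; this is a hypothetical illustration, not necessarily the paper's exact scheme:

    import numpy as np

    def entropy_rank(caps):
        """Rank components by their term -w_i * log(w_i) in the Shannon entropy
        of the market-cap weight distribution (hypothetical scheme)."""
        w = np.asarray(caps, dtype=float)
        w = w / w.sum()
        return np.argsort(-w * np.log(w))[::-1]  # largest contribution first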

2016

Doctoral Dissertations in Economics

Journal of Economic Literature, 54(4), 2016, 1551-1580. DOI: 10.1257/jel.54.4.1551.

Working Papers

Option Valuation with Maximum Entropy Densities: Accounting for Higher-Order Moments

Abstract: An alternative approach to the Black-Scholes-Merton formulation of option valuation is the entropy pricing theory. Entropy pricing applies notions of information theory to derive the theoretical value of options. I elaborate further on the maximum entropy formulation of option pricing using a generalized set of moment constraints. Higher-order moments contain more information about the price density and characterize the shape of the underlying distribution. In a Monte Carlo study, I present entropies of heavy-tailed distributions and show that entropic call densities vary with the constraints and become closer to each other as the order of moments increases. In an empirical analysis using high-frequency S&P 500 index options, I examine the impact of moment constraints on the accuracy of theoretical values. Simulation and empirical evidence suggest that the entropic pricing framework provides more accurate results for heavy-tailed, high-frequency data when higher-order moment constraints are imposed.
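
The general form of the problem is standard: maximize entropy subject to moment constraints, which yields an exponential-family density (the paper's constraint set is a generalized version of this, and the risk-neutral pricing restriction can enter as an additional constraint):

\[
\max_{f} \; -\int f(x) \ln f(x)\, dx \quad \text{s.t.} \quad \int x^{j} f(x)\, dx = \mu_j, \; j = 1, \dots, J,
\]

with solution \( f^{*}(x) = \exp\bigl( \lambda_0 + \sum_{j=1}^{J} \lambda_j x^{j} \bigr) \). Raising J incorporates skewness, kurtosis, and higher moments, which is how the shape of the price density is sharpened.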

An Information Framework for Measuring Perception Alignment in Financial Markets

co-authored with Viktoria Dalko and Hyeeun Shim.

Abstract: At the onset of the COVID-19 pandemic, the CBOE volatility index reached heights last experienced during the 2008 financial crisis. The consensus is that the World Health Organization’s announcement of the pandemic contributed to the high level of volatility. The question arises whether we have a potentially robust measure to quantify the degree to which investors’ perceptions of future asset returns were suddenly aligned by a WHO announcement. This paper proposes an information framework for measuring the degree of perception alignment, based on the perception alignment hypothesis. We provide simulation examples, illustrate empirical evidence of financial market manipulation, and estimate the loss of information due to those cases of perception alignment.

Estimating hedonic models with endogenous marketing time using quantile regression without excluded instruments

co-authored with Jason Beck and Suyong Song.

Abstract: Hedonic modeling can be used to examine the impacts of housing characteristics on selling prices. This paper examines a hedonic pricing model for single-family houses in Savannah, GA, for the period 2007–2016. Departing from conventional hedonic modeling, we consider a structural function whereby the home sale price is directly affected by the usual house attributes and by marketing time. Both the sale price and the time on the market, however, are endogenously determined. To account for this endogeneity, we estimate the structural hedonic function using a control function approach. The control-function estimator utilizes the conditional heteroscedasticity of the structural errors in the triangular model. Using this approach, we identify the relationship between the house price and its time on the market solely from nonlinearities in the control function, without searching for instruments. We further account for heterogeneous effects of marketing time using a control function based on quantile regression. Our findings suggest that housing prices increase with marketing time and that the marketing-time effect is nonlinearly larger for higher selling prices. That is, homes with lower prices sell quickly, while expensive houses tend to stay on the market for an extended period.
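
A stylized version of the triangular system (notation mine):

\[
\ln P_i = X_i'\beta + \gamma\, T_i + \varepsilon_i, \qquad T_i = X_i'\delta + U_i,
\]

where P is the sale price, T the time on the market, and X the house attributes. Endogeneity arises because \( \varepsilon \) and U are correlated; the control function \( E[\varepsilon \mid X, U] \) is identified from the conditional heteroscedasticity of the errors, so no excluded instrument is needed.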

Does Membership of the EMU Matter for Economic and Financial Outcomes?

co-authored with Kundan Kishor and Suyong Song.

Abstract: We examine the treatment effects of joining the European Monetary Union (EMU) on macroeconomic and financial outcomes in member countries. Specifically, we apply propensity score analysis to mitigate the self-selection bias associated with the non-random nature of joining the union. The findings suggest that the average treatment effect on the treated (ATT) of the EMU is associated with a decline in the volatility of inflation, real GDP growth, and bond yields. Splitting the sample into pre-crisis (1990-2008) and post-crisis (2009-2019) periods and excluding Portugal, Ireland, Greece, and Spain (PIGS) from the sample reveal divergent patterns of ATTs on bond yields and the debt-to-GDP ratio. The results suggest that the fiscal situation in the member states other than PIGS worsened in the pre-crisis period. We also find that PIGS benefited from EMU membership in terms of lower bond yields in the pre-crisis period.
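
A textbook one-to-one propensity-score matching estimator of the ATT, for illustration only (the paper's estimator and diagnostics may differ):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import NearestNeighbors

    def att_psm(X, D, Y):
        """ATT via 1-NN matching on an estimated propensity score.
        X: covariate array (n, k); D: 0/1 treatment array (EMU membership); Y: outcome array."""
        ps = LogisticRegression(max_iter=1000).fit(X, D).predict_proba(X)[:, 1]
        nn = NearestNeighbors(n_neighbors=1).fit(ps[D == 0].reshape(-1, 1))
        _, idx = nn.kneighbors(ps[D == 1].reshape(-1, 1))
        return np.mean(Y[D == 1] - Y[D == 0][idx.ravel()])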

Estimating loss from extreme climate events within a real options approach

co-authored with Ruth Dittrich.

Abstract: Sea level rise is a major consequence of climate change. This paper studies climate change uncertainty through an information-theoretic framework and examines the current cost of extreme sea level rise within a real options analysis. We first propose an approach to estimating the risk-neutral density of the change in global mean sea level and then use the estimated density to compute the expected overall cost of sea level rise. The proposed framework accounts for extreme sea level rise in computing the theoretical option value.
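
The valuation step implied above, written in standard risk-neutral form (notation mine):

\[
V_0 = e^{-rT} \int L(s)\, q(s)\, ds,
\]

where q is the estimated risk-neutral density of the change in global mean sea level, L(s) is the loss from a rise of size s, r is the discount rate, and T is the horizon; a heavy upper tail of q is what makes extreme sea level rise matter for the option value.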