Renjie Zhong

Econ PhD Student
Columbia University

Working Papers

Selling Training Data (New Draft Coming Soon)
with Jingmin Huang and Wei Zhao
What is the optimal mechanism for selling supplementary datasets?

In this paper, we develop a framework for analyzing the design and pricing of supplemental training datasets for hypothesis testing. A monopolistic seller offers versioned training datasets with associated tariffs to screen data buyers who hold different private datasets. Three features matter in this setting: the coexistence of horizontal and vertical differences, the obedience constraints, and the possibility of double deviations. We show that ruling out double deviations imposes rigidity on the menu structure induced by the multidimensional nature of data allocation, which reduces the dimension of the design problem and yields a two-tier structure as its extreme point. The seller can exploit the horizontal difference to neutralize the vertical difference by designing the lower-tier dataset to nullify the impact of the buyer's private dataset. Doing so sustains a high price for the higher-tier dataset without excluding low-value buyers. The obedience constraints limit the extent of this exploitation.



Works in Progress

Optimal Data Procurement with Tests
What is the optimal data buying mechanism when the buyer has test data?

Coming soon.



Note

On the Existence of Fully Informative Experiment in Optimal Menu
An alternative way to understand no distortion in the optimal information-selling mechanism.

In their analysis of screening with information products, Bergemann et al. (2018) show that the optimal menu contains a fully informative experiment, a result that underpins the proofs of almost all key theorems in their paper. We provide an additional proof of its existence in the direct mechanism framework. We also re-emphasize that the key difference between screening with traditional products and screening with information products lies in two features: preference orders over information products are non-congruent because buyers hold different priors, and there is a common most-preferred product, namely the fully informative experiment. This highlights how rent extraction balances the vertical and horizontal values of information.