In this paper we develop a methodology for identifying a population group that is only latently observed in a (target) survey suited to further processing, for example poverty calculations, but explicitly observed in another (source) survey that is not suited to such processing. Identification is achieved by transferring the binary group indicator from the source survey to the target survey: a logistic regression, estimated on the source survey with variables that are also available in the target survey, predicts group affiliation. The proposed methodology improves on common matching procedures by optimizing the cut-value of the predicted probability at which group affiliation is assigned in the target survey. This contrasts with the commonly used "Hosmer-Lemeshow" cut-value for binary categorization, which is set where the sensitivity and specificity curves intersect. Instead, we improve group identification by minimizing the sum of total errors as a percentage of total true outcomes.
The Jewish ultra-orthodox population in Israel serves as a case study. This idiosyncratic community, committed to the observance of the Bible, is only latently observed in the surveys typically used for poverty calculation. It is explicitly captured in the social survey, which is not suitable for poverty measurement.
This procedure is useful for ex-post enhancement of survey data in general.
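The two cut-value rules contrasted above can be illustrated with a minimal sketch. It assumes that predicted probabilities from the logistic regression are already available for the source survey, and it reads "total errors as a percentage of total true outcomes" as (false positives + false negatives) divided by the number of true group members; that reading, the function names, and the toy data are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def error_counts(p, y, cut):
    """Return (false positives, false negatives) when assigning group = 1 if p >= cut."""
    pred = p >= cut
    fp = int(np.sum(pred & (y == 0)))
    fn = int(np.sum(~pred & (y == 1)))
    return fp, fn

def cut_min_error_ratio(p, y, grid):
    """Cut-value minimizing (FP + FN) / number of true group members (assumed criterion)."""
    n_pos = int(np.sum(y == 1))
    ratios = [sum(error_counts(p, y, c)) / n_pos for c in grid]
    i = int(np.argmin(ratios))
    return float(grid[i]), float(ratios[i])

def cut_equal_sens_spec(p, y, grid):
    """Cut-value at which sensitivity and specificity are closest to each other."""
    n_pos = int(np.sum(y == 1))
    n_neg = int(np.sum(y == 0))
    gaps = []
    for c in grid:
        fp, fn = error_counts(p, y, c)
        sensitivity = 1 - fn / n_pos
        specificity = 1 - fp / n_neg
        gaps.append(abs(sensitivity - specificity))
    return float(grid[int(np.argmin(gaps))])

# Hypothetical toy data: predicted probabilities and true group membership
# in the source survey (not taken from the paper).
p = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9])
y = np.array([0, 0, 0, 1, 0, 0, 1, 1])
grid = np.linspace(0.05, 0.95, 10)  # candidate cut-values, offset from p to avoid ties

best_cut, best_ratio = cut_min_error_ratio(p, y, grid)
equal_cut = cut_equal_sens_spec(p, y, grid)
print("error-minimizing cut:", best_cut, "error ratio:", best_ratio)
print("sensitivity = specificity cut:", equal_cut)
```

On this toy sample the two rules select different cut-values, which is the situation in which the choice of criterion affects who is assigned to the group in the target survey.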