Boris Grigorevich Mirkin (Russian: Борис Григорьевич Миркин; born December 5, 1942) is a Russian scientist working in data analysis and decision-making methodologies. He graduated from the Faculty of Mathematics and Mechanics of Saratov State University, Russia, in 1964 and defended his “Candidate of Sciences” thesis there in 1966, a degree equivalent to an international PhD, on a subject in abstract automata theory. The thesis was based on correspondences between regular expressions and abstract automata noted by Mirkin and his PhD supervisor, Mark Spivak. Prof. J. Brzozowski (University of Waterloo, Canada) described these results in several synopses in the Journal of Symbolic Logic in 1969–1971,[1] leading to recognition of what is sometimes referred to as Mirkin’s prebase.[2]

In 1967, Mirkin moved to the Institute of Economics, Siberian Branch of the USSR Academy of Sciences, in Novosibirsk, Russia. There he began research on binary relations as a medium for decision making, since he considered socio-economic decisions to be mostly non-quantitative (at least as they were made in the USSR). Three results in this direction can be mentioned:

(a)  A different characterization of interval orders (1970, published internationally in 1972);[3]

(b)  Development of what later was called Mirkin’s distance between partitions;[4]

(c)  Extension of Arrow’s celebrated theorem on the impossibility of democratic choice to arbitrary relations; in its latest version, this is a characterization of Arrow’s two axioms (monotonicity and “independence of irrelevant alternatives”) as the “federation consensus rules”. Given a set of actors with their preference relations $R_i$, a federation is a set of actor coalitions $S=\{s\}$, and a federation rule has the set-theoretic form

$$F(R_1, R_2, \ldots, R_n) = \bigcup_{s \in S} \bigcap_{i \in s} R_i$$ [5] [6] [7]
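
The set-theoretic form of a federation rule can be illustrated with a minimal Python sketch. It is an illustration only and is not taken from the cited works: the function name `federation_rule`, the representation of each relation as a set of ordered pairs, and the three-actor example are all assumptions made here. With the federation consisting of all two-actor coalitions, the rule reduces to simple majority.

```python
def federation_rule(relations, federation):
    """Union, over coalitions s in the federation, of the intersection of R_i for i in s.

    relations  -- dict mapping an actor id to a set of ordered pairs (x, y),
                  read as "x is preferred to y" (illustrative representation)
    federation -- iterable of non-empty coalitions, each a set of actor ids
    """
    result = set()
    for coalition in federation:
        # Pairs on which every member of the coalition agrees.
        agreed = set.intersection(*(relations[i] for i in coalition))
        result |= agreed
    return result


# Hypothetical example: three actors ranking alternatives a, b, c.
relations = {
    1: {("a", "b"), ("b", "c"), ("a", "c")},  # a > b > c
    2: {("a", "c"), ("c", "b"), ("a", "b")},  # a > c > b
    3: {("b", "a"), ("a", "c"), ("b", "c")},  # b > a > c
}

# All two-actor coalitions: a pair is accepted if some majority agrees on it.
majority = [{1, 2}, {1, 3}, {2, 3}]
print(sorted(federation_rule(relations, majority)))
# [('a', 'b'), ('a', 'c'), ('b', 'c')]

# A single coalition of all actors gives the unanimity rule.
print(sorted(federation_rule(relations, [{1, 2, 3}])))
# [('a', 'c')]
```

A federation with the single coalition {1} would return actor 1's relation unchanged, which corresponds to a dictatorial rule.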

Mirkin’s first monograph, on the mathematics of group choice,[8] propelled him into the top ranks of the Soviet mathematical economics research community and also opened the way for scientists in Soviet satellite countries, such as Bulgaria, to do research and publish on this subject.

Then, motivated by E. Braverman (1931–1977), a founding father of Soviet pattern recognition and data science research, Mirkin and his collaborators started a program of research projects in mixed-scale data analysis oriented towards practical applications. References to E. Braverman can be found in a later volume.[8] Among Mirkin’s results of that period, the following should be mentioned:

(d)   Developing machinery (models, methods, and codes) for finding approximate partitions and individual clusters in similarity and interaction data, as well as more unusual structures such as a “structured partition” or “chain order partition”, along with applications in genetics, organizational structure design, sociology, and ecology.

(e)   Categorical factor analysis, embracing what is referred to as additive cluster analysis, that is, the sequential extraction of various cluster structures from similarity matrices, together with estimates of the contribution of each structure to the total variance.

(f)    Matrix correlation for mixed-scale data including quantitative and nominal features, leading to a method for bi-clustering, followed by methods for “relative grouping” of sociological survey data over one group of features (say, demographics) with respect to another group of features (say, leisure-time behavior).

Most of these results were published in Russian only and are now largely forgotten. Three monographs by Mirkin[9] should be mentioned, though, as well as collections edited by him.[10] Some traces of these developments can be found in his later publications.[11]

From 1983 to 1991, Mirkin worked at CEMI, the Central Economics-Mathematics Institute, Moscow, USSR. The following result of that period should be mentioned:

(g)   Considering clustering within what is currently called the auto-encoder paradigm (referred to by Mirkin as his “data recovery approach”), he embraced Principal Component Analysis and k-means clustering in a unified framework, currently referred to as the matrix factorization model, and proposed a method for “Principal cluster analysis”, which he currently refers to as Anomalous clustering and which has proved effective in many different frameworks.[12][13][14]
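
The idea of extracting a single anomalous cluster can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the published procedure: it assumes the reference point is the grand mean, starts from the point farthest from it, and alternates between assigning points that are closer to the cluster center than to the reference point and recomputing the center. The function names, the `max_iter` parameter, and the toy data are hypothetical.

```python
def sq_dist(x, y):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def anomalous_cluster(data, max_iter=100):
    """Return (member indices, center) of one cluster far from the grand mean.

    Assumption: the reference point is the grand mean of the data.
    """
    ref = mean(data)
    # Shift the origin to the reference point.
    shifted = [tuple(v - r for v, r in zip(p, ref)) for p in data]
    origin = (0.0,) * len(ref)
    # Start from the point farthest from the reference point.
    center = max(shifted, key=lambda p: sq_dist(p, origin))
    members = []
    for _ in range(max_iter):
        # Keep the points that are closer to the center than to the origin.
        new_members = [i for i, p in enumerate(shifted)
                       if sq_dist(p, center) < sq_dist(p, origin)]
        if not new_members or new_members == members:
            break  # degenerate data or membership has stabilized
        members = new_members
        center = mean([shifted[i] for i in members])
    # Report the center in the original coordinates.
    return members, tuple(c + r for c, r in zip(center, ref))


# Toy data: four points near the origin and two far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11)]
members, center = anomalous_cluster(points)
print(members)  # [4, 5]
```

Removing the extracted points and repeating the procedure on the remainder is one way such clusters could be used to choose the number of clusters and initial centers for k-means; the details of how this is done in the cited works are not reproduced here.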

From 1991 to 2011, Mirkin travelled extensively, culminating in an appointment at Birkbeck, University of London, UK (2000–2011). The following results of that period are worth mentioning:

(h)   Building an outline of a mathematical theory for non-probabilistic data analysis, including a Pythagorean decomposition of the square data scatter into the sum of two items: the k-means square-error criterion (the unexplained part) and a complementary criterion (the explained part). The explained part sheds new light on such issues as the “real” goal of k-means clustering (finding big anomalous clusters, according to Mirkin) and the contributions of nominal features, which turn out to coincide with various measures of deviation from statistical independence, including Pearson’s chi-squared association index, depending on the data normalization used.[15]

(i)     Developing the concept of the “Quetelet index”, a measure of association between feature categories, which allowed him both to unify “the dual” approaches of the so-called Analyse des Correspondances developed by Benzécri in France[16] and to give an operational meaning to Pearson’s chi-squared association index, which had been widely regarded only as a criterion of statistical independence.[17]

(j)     Mathematically modelling inconsistencies between individual gene families and the structure of the evolutionary “species” tree via histories of gene gain and loss events, which steered the team led by E. Koonin to a reconstruction of the genome of LUCA, the Last Universal Common Ancestor,[18] as well as a corresponding reconstruction for lactic acid bacteria.[19] Original models involving duplication events were also developed by Mirkin and co-authors.[14]

(k)   Consensus clustering via the “projective” distance between partitions, which allows for mathematically equivalent formulations of the same criterion in three different frameworks: similarities between objects, object-to-category data tables, and association indexes over contingency tables.[15][20]
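
The Pythagorean decomposition mentioned in item (h) can be checked numerically. For data centered at the grand mean and any partition whose cluster centers are the within-cluster means, the data scatter splits as T = B + W, where W is the k-means square-error criterion (the sum of squared distances from points to their cluster centers) and B is the sum over clusters of the cluster size times the squared norm of its center. The Python sketch below is an illustration under these assumptions; the function name, the toy data, and the labels are hypothetical.

```python
def scatter_decomposition(data, labels):
    """Return (total, explained, unexplained) data scatter for a given partition.

    The data are centered at the grand mean first; cluster centers are the
    within-cluster means of the centered data.
    """
    n = len(data)
    grand_mean = [sum(col) / n for col in zip(*data)]
    centered = [[v - g for v, g in zip(row, grand_mean)] for row in data]

    # T: sum of all squared entries of the centered data.
    total = sum(v * v for row in centered for v in row)

    # Group the centered rows by cluster label.
    clusters = {}
    for row, label in zip(centered, labels):
        clusters.setdefault(label, []).append(row)

    explained = 0.0    # B: cluster size times squared norm of the cluster center
    unexplained = 0.0  # W: k-means square-error criterion
    for rows in clusters.values():
        center = [sum(col) / len(rows) for col in zip(*rows)]
        explained += len(rows) * sum(c * c for c in center)
        unexplained += sum((v - c) ** 2
                           for row in rows for v, c in zip(row, center))
    return total, explained, unexplained


# Toy data with an obvious two-cluster structure.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11)]
labels = [0, 0, 0, 0, 1, 1]
T, B, W = scatter_decomposition(points, labels)
print(abs(T - (B + W)) < 1e-9)  # True
```

Because T is fixed by the data, minimizing W over partitions is equivalent to maximizing B, and B rewards clusters that are both large and far from the grand mean.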

Since 2011, Mirkin has worked at the Higher School of Economics, Moscow. The following developments by Mirkin and co-authors from that period should be mentioned:

(l)      Core-shell clustering, in which a cluster has an explicitly indicated “core” of more tightly connected elements; this proved fundamental in the three-step cluster analysis of the coastal upwelling phenomenon from annual sea surface temperature data.[21]

(m)  Modeling conceptual generalization as optimally lifting fuzzy leaf sets in a domain taxonomy so as to minimize the numbers of “gaps” and “offshoots”, with applications to deriving tendencies in research domains and extending audience sizes for internet advertisements.[22][23]

References

  1. ^ Brzozowski, J. A. (1971). BG Mirkin, An Algorithm for Constructing a Base in a Language of Regular Expressions. Journal of Symbolic Logic, 36(4); Brzozowski, J. A. (1972). MA Spivak. Algorithm for abstract synthesis of automata for an expanded language of regular expressions. The Journal of Symbolic Logic, 37(3), 620-620.
  2. ^ Goldengorin, B. (2023), From Prebase in Automata Theory to Data Analysis: Boris Mirkin’s Way. Data Analysis and Optimization. In honor of Boris Mirkin's 80th Birthday. Springer Optimization and Its Applications, 202: 147-156.
  3. ^ Mirkin, B. G. (1972). Description of some relations on the set of real-line intervals. Journal of Mathematical Psychology, 9(2), 243-252. (see also in Russian, Миркин, Б. Г. (1970). Об одной аксиоме математической теории полезности. Кибернетика, (6.), Киев.).
  4. ^ Mirkin, B. G., & Chernyi, L. B. (1970). Measurement of the distance between distinct partitions of a finite set of objects. Autom Tel, 5, 120-127.(see Миркин, Б. Г., & Черный, Л. Б. (1970). Об измерении близости между различными разбиениями конечного множества объектов.—«Автоматика и телемеханика», 1970. Automation and Remote Control, 5(5), 786-792.) See also Миркин Б.Г. (1969) Об одном подходе к обработке нечисловых данных, Математические методы моделирования экономических задач (An approach to the analysis of non-numeric data), Новосибирск, ИЭиОПП СО АН СССР, 141-150.
  5. ^ Mirkin, B. G. (1975). On the problem of reconciling partitions. Quantitative sociology, international perspectives on mathematical and statistical modelling, 441-449.
  6. ^ Mirkin, B. G. (1982). Federations and transitive group choice. Mathematical Social Sciences, 2(1), 35-38. This is a synopsis of: Миркин, Б. Г. (1979). Федерации и транзитивность группового выбора. Модели социально-экономических процессов и социальное планирование. М.: Наука, 104-119.
  7. ^ B.G. Mirkin (1979) Group choice. J. Wiley & Sons (Translated from Russian original: Б.Г. Миркин (1974) Проблема группового выбора, М. Физматгиз).
  8. ^ a b B.G. Mirkin (1979) Group choice. J. Wiley & Sons (Translated from Russian original: Б.Г. Миркин (1974) Проблема группового выбора, М. Физматгиз).
  9. ^ See monographs by B. Mirkin in Russian: Б.Г. Миркин (1976) Анализ качественных признаков, М., Статистика (Analysis of categorical features); Б.Г. Миркин (1980) Анализ качественных признаков и структур, М., Финансы и статистика (Analysis of categorical features and structures); Б.Г. Миркин (1985) Группировки в социально-экономических исследованиях. М.. Финансы и статистика (Groupings in socioeconomic research).
  10. ^ See collections: Модели анализа данных и принятия решений, Новосибирск, ИЭиОПП СО АН СССР, 1980 (Methods for data analysis and decision making); Методы анализа многомерной экономической информации, Новосибирск, Наука, 1981 (Methods for analysis of multivariate data in economics).
  11. ^ Mirkin, B. G. (1987). Additive clustering and qualitative factor analysis methods for similarity matrices. Journal of Classification, 4, 7-31;  Mirkin, B. G. (1990). A sequential fitting procedure for linear data analysis models. Journal of Classification, 7, 167-195; Mirkin, B. (1996). Mathematical classification and clustering (Vol. 11). Springer Science & Business Media.
  12. ^ See: (i) Миркин, Б. Г. (1987). Метод главных кластеров. Автоматика и телемеханика, (10), 131-143 (Method for principal cluster analysis); (ii) Chiang, M. M. T., & Mirkin, B. (2010). Intelligent choice of the number of clusters in k-means clustering: an experimental study with different cluster spreads. Journal of classification, 27, 3-40; (iii) De Amorim, R. C., & Mirkin, B. (2012). Minkowski metric, feature weighting and anomalous cluster initializing in K-Means clustering. Pattern Recognition, 45(3), 1061-1075; (iv) Mirkin, B., & Nascimento, S. (2012). Additive spectral method for fuzzy cluster analysis of similarity data including community structure and affinity matrices. Information Sciences, 183(1), 16-34.
  13. ^ (v) Mirkin, B. and Shalileh, S., 2022. Community detection in feature-rich networks using data recovery approach. Journal of Classification, 39(3), pp.432-462
  14. ^ a b Mirkin, B., Muchnik, I., & Smith, T. F. (1995). A biologically consistent model for comparing molecular phylogenies. Journal of Computational Biology, 2(4), 493-507; Eulenstein, O., Mirkin, B., & Vingron, M. (1998). Duplication-based measures of difference between gene and species trees. Journal of Computational Biology, 5(1), 135-148.
  15. ^ a b Mirkin, B. (2005). Clustering for data mining: a data recovery approach. Chapman and Hall/CRC.
  16. ^ Benzécri, J. P. (1977). Histoire et préhistoire de l'analyse des données. Partie V L'analyse des correspondances. Cahiers de l'analyse des données, 2(1), 9-40.
  17. ^ Mirkin, B. (2001). Eleven ways to look at the chi-squared coefficient for contingency tables. The American Statistician, 55(2), 111-120; Lebart, L., & Mirkin, B. G. (1993). Correspondence analysis and classification. In Multivariate Analysis: Future Directions 2 (pp. 341-357). North-Holland.
  18. ^ Mirkin, B. G., Fenner, T. I., Galperin, M. Y., & Koonin, E. V. (2003). Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes. BMC evolutionary biology, 3, 1-34.
  19. ^ Makarova, K., Slesarev, A., Wolf, Y., Sorokin, A., Mirkin, B., Koonin, E., ... & Mills, D. (2006). Comparative genomics of the lactic acid bacteria. Proceedings of the National Academy of Sciences, 103(42), 15611-15616.
  20. ^ Mirkin, B. G., & Shestakov, A. (2013). Least square consensus clustering: criteria, methods, experiments. In Advances in Information Retrieval: 35th European Conference on IR Research, ECIR 2013, Moscow, Russia, March 24-27, 2013. Proceedings 35 (pp. 764-767). Springer Berlin Heidelberg.
  21. ^ Nascimento, S., Martins, A., Relvas, P., Luís, J. F., & Mirkin, B. (2023). Core–shell clustering approach for detection and analysis of coastal upwelling. Computers & Geosciences, 179, 105421.
  22. ^ Frolov, D., Nascimento, S., Fenner, T., & Mirkin, B. (2020). Parsimonious generalization of fuzzy thematic sets in taxonomies applied to the analysis of tendencies of research in data science. Information Sciences, 512, 595-615.
  23. ^ Frolov, D., Taran, Z., & Mirkin, B. (2020). A method for audience extending in programmatic advertising by using parsimonious generalization of user segments. In Human Interaction and Emerging Technologies: Proceedings of the 1st International Conference on Human Interaction and Emerging Technologies (IHIET 2019), August 22-24, 2019, Nice, France (pp. 837-841). Springer International Publishing.