Main Article Content



Objectives: In December 2019, a novel and highly infectious coronavirus disease (COVID-19) was first described in Wuhan, China. The disease has spread to 210 countries and territories across the world, and more than two million confirmed infections have been reported. In India, the disease was first detected on 30 January 2020 in Kerala, in a student who had returned from Wuhan. It has since spread continuously across the states of India. The main objective of this study was to identify and classify affected districts into real clusters on the basis of similarities within a cluster and dissimilarities among different clusters, so that government policies, decisions, and medical facilities (ventilators, testing kits, masks, treatment, etc.) could be improved to reduce the number of infected and deceased persons and thereby increase the number of cured cases.

Materials and Methods: We concentrated on the COVID-19-affected states and union territories (UTs) of India in this report. To fulfill the task, we applied cluster analysis, one of the data mining techniques. Variation among the clusters for each of the variables was studied using box plots. We used PAST software to obtain a scatter plot for each of the variables.
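As a rough illustration of the kind of cluster analysis described above, the following minimal sketch groups labelled case counts by agglomerative (hierarchical) clustering. The abstract does not state which linkage or distance measure was used, so Ward's minimum-variance criterion is an assumption here. The function name ward_clusters, the region labels, and the counts are all hypothetical and are not data from the study.

```python
from statistics import mean


def ward_clusters(values, k):
    """Group labelled 1-D values into k clusters by agglomerative merging.

    Assumption: Ward's criterion is used, i.e. each step merges the pair of
    clusters whose union gives the smallest increase in within-cluster
    sum of squares.
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of observations")

    # Start with every observation in its own cluster.
    clusters = [[label] for label in values]

    while len(clusters) > k:
        best = None  # (cost, i, j) of the cheapest merge found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a = [values[label] for label in clusters[i]]
                b = [values[label] for label in clusters[j]]
                # Ward merge cost for two clusters of sizes len(a) and len(b).
                cost = len(a) * len(b) / (len(a) + len(b)) * (mean(a) - mean(b)) ** 2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    return [sorted(cluster) for cluster in clusters]


# Hypothetical confirmed-case counts for five made-up regions (NOT study data).
confirmed = {"A": 120, "B": 150, "C": 900, "D": 950, "E": 5000}
groups = ward_clusters(confirmed, 3)
print(groups)
```

The same routine could be run separately on confirmed, cured, and death counts, and the resulting groups compared with box plots as the study describes. For real data, a statistics package would normally be preferred over this hand-written loop.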

Results: Results were obtained from the cluster analysis and box plot methods for each of the variables. For confirmed cases, cluster I corresponded to the states AP, AR, AS, BR, CG, GA, GJ, HR, HP, JH, KA, KL, MP, MH, MN, ML, MZ, NL, OR, PB, RJ, SK, TN, TG, TR, UP, UK, WB, AN, CH, DNDD, DL, JK, LA, LD, PY. For cured cases, cluster II, and for death cases, cluster III, corresponded to all the states and UTs of India.

Conclusions: The study showed that the states MH, AP, AR, DL and KL under cluster I have a high number of confirmed cases. The box plots and histograms show variations among the different clusters for the three types of cases. The trends in the box plots and histograms showed a good percentage of cured cases in some of the states and UTs. It was observed that the states under cluster III (MH, UP, KR, TN, DL and WB) had severe conditions that call for optimized monitoring techniques, which could help the government improve its policies and actions to reduce the number of infected persons.


Keywords: Coronavirus disease-19, India, cluster analysis, box plot, data mining.

Article Details

How to Cite
ZARGAR, S. A., ISLAM, T., REHMAN, I. U., & PANDEY, D. (2021). USE OF CLUSTER ANALYSIS TO MONITOR NOVEL CORONA VIRUS (COVID-19) INFECTIONS IN INDIA. Asian Journal of Advances in Medical Science, 3(2), 1-7. Retrieved from
Original Research Article


