Aparna Varde

Associate Professor, Computer Science

Richardson Hall 305
BE, University of Bombay
MS, Worcester Polytechnic Institute
PhD, Worcester Polytechnic Institute
Dr. Aparna Varde is a tenured Associate Professor in the Department of Computer Science at Montclair State University, NJ, USA. She obtained her Ph.D. and M.S. in Computer Science from Worcester Polytechnic Institute, MA, USA, and her B.E. in Computer Engineering from the University of Bombay, India. She was awarded an Associate Membership of Sigma Xi, the Scientific Research Society, in 2005 for excellence in multidisciplinary work. Her research spans data mining, databases and artificial intelligence, with over 80 publications and 2 software trademarks. Dr. Varde has co-chaired Ph.D. workshops/forums in ACM CIKM 2007, 2008, 2010, 2012 and 2014 and IEEE ICDM 2013. She has served on the PC of various conferences, e.g., ACM's CIKM & EDBT, IEEE's ICDM, SIAM's SDM and Springer's DEXA, and has been a reviewer for journals including IEEE's TKDE, ACM's TKDD, Elsevier's DKE, Springer's DMKD and ACM's VLDB journal. She has been the dissertation advisor for two Ph.D. students in Environmental Management as Doctoral Faculty in that program and the research advisor for M.S. and B.S. students in Computer Science. She has also been an advisor for a Visiting Fulbright Ph.D. Scholar in Computer Science at Montclair and an external committee member for two Ph.D. students in Computer Science from Queensland University of Technology, Australia. Dr. Varde has served as a panelist for NSF's Cyber-enabled Discovery and Innovations Program (CDI) through their division of Information and Intelligent Systems (IIS). Her research is supported by grants from organizations such as PSE&G and NSF, USA. Her prior academic experience includes being a Tenure Track Assistant Professor in the Department of Math and Computer Science at Virginia State University, USA, and a Visiting Senior Researcher at the Max Planck Institute for Informatics, Germany. She also has industrial experience, mostly in multinational companies such as Lucent Technologies and Citicorp. Dr. Varde is classified as an outstanding researcher by the Citizenship and Immigration Services, USA.


Data Mining - Predictive Analytics, Decision Support Systems, Scientific Data Mining, Text Mining and Linguistics Issues
Artificial Intelligence - Common Sense Knowledge, Smart Cities, Machine Learning, Information Retrieval
Database Systems - Web Databases, XML, Cloud Computing, Big Data Management
Environmental Management (Doctoral Program) - Green IT, Urban Policy, Geo-informatics


Research Projects

Decision Support in Green Information Technology

This multidisciplinary research in data mining and environmental management is supported by a grant from PSE&G. It investigates greener solutions for data centers with the goals of energy efficiency and adequate performance. Data mining techniques play a significant role in the development of a decision support system, GreenDSS, that will assist IT managers in moving towards green computing in their respective data centers. This grant has supported a Ph.D. student, Michael Pawlish, in Environmental Management, with Dr. Varde as the dissertation advisor in her capacity as Doctoral Faculty Member in that program. It has led to publications in ACM's SIGMOD Record journal, the IJCAC journal, IEEE's ICIAFS, ACM's CIKM workshops, IEEE's ICDM workshops and various other multidisciplinary venues. Further work emerging from this research, on the use of cloud and hybrid models for green business solutions along with the DevOps (development and operations) paradigm, appeared in ACM's SIGKDD Explorations journal. Results from this work have been used by PSE&G and Montclair State University for developing greener data centers.
PhD Student: Michael Pawlish (Graduated: May 2014)
Funding: PSEG Research Grant (2011 to 2013)

Terminology Evolution in Information Retrieval

This research in the overall area of Web and text mining started as joint work with Max Planck Institute, Germany, where Dr. Varde was a Visiting Senior Researcher. The goal of this project is to detect evolving terminology when responding to user queries on the Web by mining existing text archives. The aim is to enhance information retrieval by incorporating historical information on terms contained in queries. This led to a Master's project by a CS graduate student, Debjani Roychoudhury, and a Master's thesis by a CS graduate student, Amal Kaluarachchi. It has been published in AAAI, ACM's EDBT and ACM's CIKM conferences.
M.S. Thesis Student: Amal Kaluarachchi (Graduated: May 2010)
M.S. Project Student: Debjani Roychoudhury (Graduated: May 2009)
Funding: Faculty Research Visit at Max Planck Institute, Germany (2008)

Learning By Mining Nanoscale Images

This work is funded by a grant from NSF REU and supports undergraduate students from the tri-state area during summers. The focus of this grant is in the area of image processing and my contribution is in the area of learning from image data at the nanoscale level. The work entails proposing and implementing techniques for discovering knowledge from image data useful in domain-specific decision-making. This project involves real data obtained from researchers in Nanotechnology, used for running experiments with the proposed techniques. It has real-world applications such as drawing conclusions from biological images by automating comparisons between them through learning suitable notions of similarity. This has the broader impact of catering to areas such as health informatics. For example, the results of the learning process are useful in finding a cheaper material in place of a more expensive one to develop a human body implant, if both materials yield similar results as evident from image similarity search. This holds because these images are generated from real experiments. Publications from this work include a paper in the SPIE 2010 conference, a presentation in the ACM CCSC 2010 conference, and a paper in the ICML 2010 Workshops.
Summer Research Student: Gregory Roughton (Completed: July 2009)
Summer Research Student: Daniel Jackowitz (Completed: July 2010)
Funding: NSF REU Grant (2009 to 2010)

Addressing Articles, Collocations and Prepositions in L2 English Text and Machine Translation

This research is in the area of text mining and computational linguistics. It involves the classification of article errors, correction of odd collocations and prediction of preposition usage in texts written by L2 (non-native) English speakers and in automated translation to English by machines. Article errors pertain to entering articles where not needed, omitting articles where needed and entering the wrong article. Odd collocations involve using incorrect combinations of terms, such as "powerful tea" when the user actually means "strong tea". Preposition prediction involves suggesting an appropriate preposition in a sentence. This is useful in writing aids typically designed for ESL learners. Mining the concerned text and deploying machine learning techniques such as classification and ensemble learning play an important role here. Related publications include conference papers in AAAI's FLAIRS 2010 and IEEE's ICICS 2013, and a journal paper in ACM SIGKDD Explorations 2015. A research tutorial in ACM CIKM 2017 encompassed the collocations component of this work.
M.S. Thesis Student: Alan Varghese (Graduated May 2013)
M.S. Project Student: Aliva Pradhan (Graduated May 2011)
M.S. Project Student: Pooja Bhagat (Graduated May 2014)
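
As a rough illustration of the collocation idea described for this project (preferring "strong tea" over "powerful tea"), the sketch below picks the near-synonym that co-occurs far more often with the noun. It is only a minimal sketch under stated assumptions, not the method used in the publications: the counts, the synonym table, the function name `suggest_collocation` and the `min_ratio` threshold are all hypothetical.

```python
from collections import Counter

# Hypothetical adjective-noun counts standing in for a reference corpus.
BIGRAM_COUNTS = Counter({
    ("strong", "tea"): 120,
    ("powerful", "tea"): 1,
    ("powerful", "engine"): 95,
    ("strong", "engine"): 8,
})

# Hypothetical near-synonym groups; a real system might use a thesaurus.
SYNONYMS = {
    "powerful": ["strong"],
    "strong": ["powerful"],
}


def suggest_collocation(modifier, noun, min_ratio=5.0):
    """Return a clearly more frequent synonym-based pair, else the input pair."""
    original_count = BIGRAM_COUNTS[(modifier, noun)]
    # A candidate must be at least min_ratio times as frequent as the original.
    threshold = max(original_count, 1) * min_ratio
    best, best_count = (modifier, noun), original_count
    for candidate in SYNONYMS.get(modifier, []):
        count = BIGRAM_COUNTS[(candidate, noun)]
        if count >= threshold and count > best_count:
            best, best_count = (candidate, noun), count
    return best


print(suggest_collocation("powerful", "tea"))
print(suggest_collocation("strong", "tea"))
```

The ratio threshold is one possible design choice: it leaves an already common pairing untouched and only proposes a change when the alternative is much more frequent.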

XML-based Markup Languages and Semantic Web Standards

This work involves the use of XML and other standards in Web development for various real-world applications. It is partly supported by a SHIP grant through Roche and Merck to fund Honors students in BS degree programs in various science disciplines, who are expected to work with their respective faculty mentors to complete an undergraduate thesis in a concerned area. My role as faculty mentor is to work with a BS student in Information Technology on a specific research project, namely, XML-based markup languages and Cloud Computing in the management of EHR (Electronic Health Records). We are investigating the use of the medical markup language MML, which constitutes a DSML (Domain Specific Markup Language) for storing and exchanging health records, and proposing techniques for knowledge discovery over such XML-based standards. We are also investigating the use of cloud technology in storage, retrieval and knowledge discovery pertaining to healthcare, taking into account issues such as cost-effectiveness, risk analysis and scalability. Another aspect of this work includes the use of RDF, OWL and SPARQL for meta knowledge extraction in an application pertaining to university systems, helpful in ubiquitous computing by using Semantic Web standards. This constitutes the M.S. Project work of a student and has been published in IEEE UEMCON 2016, with a best paper award in one of the tracks. Other related publications include a paper in the IEEE ICDM 2011 conference in their KDCloud workshop and another one in IEEE's ICIAFS conference. Research tutorials that entail the DSML part of this work and related research in XML, the Hidden Web and the Semantic Web have been given in Springer's DASFAA 2009 and ACM's EDBT 2011 conferences.
M.S. Project Student: Aliva Pradhan (Graduated May 2011)
B.S. Honors Student: Jonathan Tancer (Graduated May 2012)
Funding: Science Honors Innovation Program (2010 onwards)

Cloud Computing in Big Data and Social Media

This project focuses on research in cloud computing with emphasis on managing and mining big data. Besides a thorough investigation of existing methodologies, it addresses the design and implementation of novel techniques and the enhancement of existing approaches for big data management and mining on the cloud. The project involves exploratory research with cloud technologies such as Hadoop, Hive and Mahout for big data. Various real world data sets are used in the context of areas such as scientific data management. Predictive analysis on the cloud is also conducted deploying machine learning algorithms in Mahout with specific reference to text classification, recommender systems and decision support. This project also involves opinion mining over cloud-based social media such as Twitter, where results of sentiment analysis are useful in applications such as recommenders. This has led to publications in the NJBDA Symposium, ACM CIKM's CloudDB 2013 and IEEE ICDM's KDCloud 2013. A research tutorial has been presented at the DASFAA 2015 conference based on some outcomes of this work and related work by other researchers in the area.
M.S. Project Student: Klavdiya Hammond (Graduated May 2013)
M.S. Project Student: Shireesha Chandra (Graduated May 2012)
M.S. Project Student: Ketaki Gandhe (Graduated May 2015)

GIS, Urban Sprawl and Air Quality Issues

This research spans Geographic Information Systems (GIS), Urban Sprawl and Air Quality. It entails mining spatial and temporal data to predict urban sprawl. It employs association rules with domain knowledge to discover relationships between various sprawl-causing parameters such as unemployment, traffic, demographics and pollution, the impact of such parameters on sprawl, and vice versa. It also involves dealing with pollution issues for further analysis of their relationship with urban sprawl. This is with the goal of air quality assessment incorporating public health factors using EPA standards. Data mining on multi-city data worldwide is conducted using classical techniques of association rule mining, clustering and classification to discover various causes of pollution in urban areas and predict air quality based on the analysis. This work also includes a social media mining component wherein reactions expressed by the public on pollution-causing factors and their related solution mechanisms are assessed. Important outcomes of this work are prototype tools for sprawl prediction and for air quality assessment. Some of this research entails the early dissertation work of the PhD student Xu Du. This work has been published in the KDD 2014 Bloomberg track, ICDE 2016 workshops and other venues.
PhD Student: Xu Du (ongoing)
M.S. Project Student: Anita Pampoore-Thampi (Graduated May 2014)
Funding: DA from Environmental Management Program
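
To make the association-rule step mentioned in this project concrete, the following sketch computes the two standard rule measures, support and confidence, over a handful of records. It is a minimal sketch, assuming each record is a set of conditions observed together; the records, the item names and the helper functions are hypothetical and are not data or code from the project.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical records: conditions observed together in one region and year.
records = [
    frozenset({"high_traffic", "high_unemployment", "sprawl"}),
    frozenset({"high_traffic", "sprawl"}),
    frozenset({"high_traffic", "high_pollution"}),
    frozenset({"high_unemployment"}),
]

# Rule under test: high_traffic -> sprawl
rule_support = support(records, {"high_traffic", "sprawl"})
rule_confidence = confidence(records, {"high_traffic"}, {"sprawl"})
print(rule_support, rule_confidence)
```

A rule is usually kept only if both measures exceed thresholds chosen by the analyst; domain knowledge, as described above, could then be used to filter or rank the surviving rules.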

Urban Policy and Common Sense Knowledge for Smart Cities

This project involves knowledge discovery from urban policy through modeling and mining of ordinances or local laws considering factors such as frequency, duration, attention to issues in sessions etc. A very important aspect is finding relationships between ordinances and smart city characteristics incorporating common sense knowledge (CSK) to capture human judgment in the mapping. The objective is to provide feedback to urban agencies on how well they are managing the given urban region and how much that region heads towards being a smart city. Another part of this work focuses on opinion mining over public reactions to the respective ordinances as expressed on social media through tweets. This involves polarity classification based on various levels of sentiments. This sentiment analysis indicates the satisfaction of the common public regarding the ordinances in their respective urban region and provides further assessment on how close this region is to being a smart city. The involvement of the public through such means is itself an aspect of the smart government characteristic through greater transparency in governance. Challenges include dealing with natural language in ordinances and tweets in addition to informal grammar, acronyms etc. in tweets which also entails the use of CSK. The source of CSK in this research is a worldwide repository called WebChild developed at Max Planck Institute for Informatics (MPII), Germany. Early work on this project started in 2015 during a faculty research visit to MPII. This has so far been published in a Tech Report of MPII 2015, IEEE UEMCON 2017 and W3C's WWW 2018. More work is in progress.
PhD Student: Xu Du (ongoing)
M.S. Thesis Student: Manish Puri (ongoing)
Funding: Faculty Research Visit at Max Planck Institute (Germany); DA from Environmental Management; RA from Computer Science

Common Sense in Implicit Requirements for Smart City Tools and Autonomous Vehicles for Smart Mobility

This work is in the general area of deploying common sense knowledge (CSK) in smart cities. One aspect entails the use of CSK in the identification and management of implicit requirements (IMRs) during the requirements specification phase in Software Engineering. As opposed to explicit requirements clearly outlined by users, implicit ones are more subtle and need to be inferred to ensure the success of software development. A framework integrating CSK, Ontology and Text Mining is built in this research to address implicit requirements. This framework would be useful in implementing smart city tools by identifying IMRs from a user perspective. This part of the work has been published in DMIN 2016, with more publications ongoing. Another aspect of this work deals with the deployment of CSK in the smart mobility characteristic of smart cities. More specifically, it involves embedding CSK in autonomous vehicles to enable them to make more well-informed decisions, analogous to those of excellent human drivers. This work aims to enhance automated driving, especially with reference to safety-related issues. It also involves augmenting object detection with commonsense knowledge, with the potential of being usable in autonomous vehicles. This work is in its relatively early stages and has been published in the IEEE ICTAI 2017 workshop on Smart Cities. Much of this work has emerged from a project on commonsense knowledge in domain-specific knowledge bases initiated during a faculty research visit to Max Planck Institute for Informatics, Germany, in 2015.
PhD Student: Onyeka Emebo (Visiting Fulbright Scholar 2015-2016)
M.S. Project Student: Abidha Pandey (ongoing)
B.S. Honors Student: Priya Persaud (Graduated May 2017)
Funding: Fulbright Scholarship Program; NSF LSAMP program; Faculty Research Visit to Max Planck Institute, Germany