AN ABSTRACT OF THE THESIS OF
Seungbin Oh for the Master of Science
in Art Therapy Counseling presented on March 26, 2015
Title: The Cross-Cultural Utility of the Formal Elements Art Therapy Scales and Person
Picking an Apple from a Tree
Abstract approved: ________________________________________________________
 The purpose of this study was to contribute to the establishment of empirical data 
to support the cross-cultural use of art therapy assessment by looking at one art therapy 
assessment in particular, the Formal Elements Art Therapy Scale (FEATS) used with the Person
Picking an Apple from a Tree (PPAT). This research was designed to identify whether the 
FEATS instrument in coordination with the PPAT (Gantt & Tabone, 1998) would be a 
reliable art therapy assessment in a cross-cultural context by obtaining normative data 
through testing Asian and American participants and using Asian and American raters.  
 The first hypothesis stated there would be cross-cultural reliability of the 
assessment instrument, the FEATS, between the Asian and American participant and rater 
groups. The second hypothesis was that the normative statistics obtained in this study 
would be consistent with the originators’ (Gantt & Tabone, 1998) predictions about
non-patient drawings and with normative statistics obtained from previous research
(Bucciarelli, 2011; Nan & Hinz, 2012). The last hypothesis was that there would be no
difference in the scores of the two college student groups, Americans and Asians, on the 
majority of scales for the FEATS assessment.  
 The research was conducted with a total of 114 participants from both Asian and 
American cultural groups with equal numbers from each demographic. Participants were
selected from undergraduate classes and student communities at a mid-sized public
university in the United States. Asian and American participants completed the PPAT task, 
and their drawings were scored by a group of Asian raters and a group of American raters 
to examine interrater reliability and to provide normative data for both cultural groups. 
Data were analyzed using statistical tests, including Pearson’s correlation and the t-test.
Results of this study supported the cross-cultural reliability of the FEATS with PPAT 
drawings for both Asian and American cultural groups. Future implications and 
recommendations are offered to improve the rigor of art therapy assessment research and 
future normative studies. 
 Keywords: art therapy, assessment, cross-cultural utility, FEATS, PPAT  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
THE CROSS-CULTURAL UTILITY OF THE FORMAL ELEMENTS ART THERAPY 
SCALES AND PERSON PICKING AN APPLE FROM A TREE 
 
 
 
 
_________ 
A Thesis 
Presented to the Department of Counselor Education  
EMPORIA STATE UNIVERSITY 
_________ 
In Partial Fulfillment 
Of the Requirement for the Degree 
Master of Science 
_________ 
 
 
 
 
by 
Seungbin Oh 
May 2015 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
   ______________________________________ 
           Approved by the Department Chair 
 
   ______________________________________ 
           Committee Chair 
                           Gaelynn P. Wolf Bordonaro, Ph. D., ATR-BC 
                                                                 
______________________________________ 
   Committee Member 
   Mingchu Luo, Ed. D.  
    
        ______________________________________ 
   Committee Member 
   Jessica Woolhiser Stallings, MS, ATR-BC, LPC 
 
   _______________________________________ 
Dean of the Graduate School and Distance   
Education
 
ACKNOWLEDGMENTS 
 This thesis was in no way a solitary effort, and I would like to thank everyone
who helped me on this tough journey. First and foremost, I thank my committee chair, Dr.
Gaelynn Wolf Bordonaro, for her time and guidance. I also thank my committee members,
Jessica Woolhiser Stallings, for her willingness to help me out, and Dr. Mingchu Luo, for
his guidance on the statistical analysis.
 I could not have completed this thesis without the support of my family. I would 
like to thank all my family members. This thesis would have been impossible without Lu 
Lee, my sweetie, who was my source of strength and encouragement, and who sent me a 
lot of sweet snacks to ensure that I survived this tough journey.
  
“I am a pessimist because of intelligence, but an optimist because of will” 
Antonio Gramsci  
  
 
TABLE OF CONTENTS 
Page  
ACKNOWLEDGMENTS ................................................................................................. iii 
TABLE OF CONTENTS ................................................................................................... iv 
LIST OF TABLES ............................................................................................................ vii 
LIST OF FIGURES ......................................................................................................... viii 
CHAPTER 
1 INTRODUCTION ............................................................................................................1 
Literature Review ...................................................................................................3  
Multiculturalism .....................................................................................................3  
Justification for Multiculturalism..............................................................4 
The Importance of Multiculturalism in the United States .........................5 
The Effect of Multiculturalism on Academic Fields .................................5 
The Effect of Multiculturalism on Mental Health Field ...........................6 
The Effect of Multiculturalism on Art Therapy Field ...............................6 
Art and Cultural Competency .................................................................................7 
The Universality of Art across Cultures ....................................................7 
The Cultural Particularity of Art ...............................................................8 
Art Therapy ............................................................................................................8 
Expectation of Multicultural Competency ................................................9 
International Art Therapy ..........................................................................9 
Art Therapy in Asia ................................................................... 11 
Art-based Assessment with Cultural Competency ............................................... 11 
Cultural Sensitivity and Relevance of Art-based Assessment ................12 
Import of Research on Cultural Competency of Art Therapy  
Assessment ................................................................................13 
Formal Elements Art Therapy Scale and Person Picking an Apple from a Tree ..14 
The Person Picking an Apple from a Tree ..............................................14 
The Formal Elements Art Therapy Scales ..............................................15 
Reliability and Validity of the FEATS with the PPAT in Clinical Setting .............20 
The Normative Study and Data of the FEATS with the PPAT ...............................24 
The Cross Cultural Reliability and Validity of the FEATS with the PPAT ............26 
Summary and Hypotheses......................................................................................29 
2 METHOD .......................................................................................................................31 
Participant ..........................................................................................................31
The Sample .............................................................................................31 
Selection Methods ...................................................................................34 
Research Design and Instrument ..........................................................................35 
Logistics ..................................................................................................35 
The PPAT ................................................................................................35 
The FEATS ..............................................................................................35 
Demographic Questionnaire ...................................................................36 
Procedure ..............................................................................................................36 
IRB Approval and Informed Consent .....................................................36 
Administration ........................................................................................37 
Rater and Rating Procedure ....................................................................37 
Data Analysis ........................................................................................................38 
3 RESULTS ........................................................................................................................40 
Interrater Reliability ...............................................................................................40 
Normative Data ......................................................................................................42 
Independent t-test ...................................................................................................44 
4 DISCUSSION .................................................................................................................47 
Hypothesis One ......................................................................................................48 
Hypothesis Two......................................................................................................50 
Hypothesis Three ...................................................................................................55 
Challenges ..............................................................................................................60 
Limitations .............................................................................................................60 
Future Implications and Conclusion ......................................................................61 
REFERENCES ..................................................................................................................63 
APPENDICES ...................................................................................................................70 
Appendix A: Example of Person Picking an Apple from a Tree ..........................70 
Appendix B: Formal Elements Art Therapy Scales .............................................72 
Appendix C: Demographic Questionnaire ...........................................................86 
Appendix D: Emporia State University IRB Approval Letter ..............................88 
Appendix E: Informed Consent ............................................................................90 
Appendix F: Formal Elements Art Therapy Scales Rating Sheet ........................92 
  
 
LIST OF TABLES 
TABLE               PAGE 
1     Summary of Characteristics of Participants ...........................................................33 
2     Inter-rater Reliability Correlations for the FEATS ................................................41 
3     Normative Statistics of the 14 FEATS Scales for Asian and American .................43 
4     t-test Results of the 14 FEATS Scales by the Asian and American  
Sample Group ........................................................................................................46 
  
 
LIST OF FIGURES 
FIGURE              PAGE 
1      Example Showing the Rating Problem for the Rotation Scale ............................49 
2      PPAT Drawing Demonstrating an Unrealistic Person’s Hand Size .....................52 
3      Example of Creativity and Playfulness in a PPAT Drawing ................................53 
4      Example of an Asian Participant’s Drawing Demonstrating a Close Visual  
       Relationship between Persons, Tree, and House .................................................57 
5      Example of an American Participant’s Drawing Demonstrating a Focus on  
Separate Elements rather than a Relationship between the Person and Tree .......59
 
CHAPTER 1 
INTRODUCTION  
 Multiculturalism, a political philosophy about the appropriate commitment to 
cultural and religious diversity and to changing dominant patterns of representation that 
marginalize certain groups, is increasingly important in the contemporary world 
(Gutmann, 2003; Taylor, 1992; Young, 1990). The growing import of multiculturalism is
rooted in, and gains its justification primarily from, the violent tendency of the Western
cultural imperialism inherent in 20th-century colonialism, which significantly undermined
human welfare and spirit (Gutmann, 2003; Song, 2007).
 Mental health professions, including art therapy, are among the most responsive fields to
the growing import of multiculturalism. The increasing influence of multiculturalism on
mental health treatment presents not only a desirable challenge, namely that counselors and
therapists become competent in multicultural treatment, but also a serious concern for
mental health practice: counseling theories, the role of the counselor or therapist,
treatment interventions, and techniques have evolved from Western Euro-centric values
and philosophies and may perpetuate cultural imperialism toward clients from minority
cultural groups (Arrington & Yorgin, 2001; Betts, 2013; Hocoy, 2002; Sue & Sue, 1999).
Psychological assessments, instruments designed to help clinicians understand 
clients, are of particular concern as possible agents of cultural imperialism that 
marginalize and stigmatize minority groups with flagrant labels of mental illness 
(Johnson, 2001; Reynolds & Suzuki, 2012). There have been serious questions about the 
fact that many assessments derive from a set of cultural assumptions, values, and 
constructions that are uniquely Euro-American in origin (Hocoy, 2002; Johnson, 2001; 
Reynolds & Suzuki, 2012). Specifically, art-based projective assessments have become
the center of concern and controversy due to their unique nature; although they are less
distorted by linguistic expression, their results are frequently misunderstood as a secret
interpretation of the symbolism and content of a client’s culture in the art response
(Betts, 2013; Hocoy, 2002).
Since antiquity, art has existed in every culture, and it has been regarded as a
universal form of communication and a common medium of expression (Dissanayake,
1995). As a result, art-based assessments, which largely depend on the assumption of the
universality of art, have been considered less culturally bound (Hocoy, 2002; Williams,
French, Picthall-French, & Flagg-Williams, 2011). However, the universality of art does
not prove that art-based assessment tools are culturally valid. In fact, art-based 
assessments are not widely recognized as culturally legitimate or relevant instruments 
due to the lack of research and quantitative evidence (Betts, 2013; Feder & Feder, 1998; 
Hocoy, 2002). 
The lack of empirical evidence to support the assumptions of art-based 
assessments causes controversy regarding their cultural reliability. Furthermore, there is a 
rational concern that art-based assessments could serve as agents of cultural imperialism 
that stigmatize minority groups with labels of mental illness (Betts, 2013; Feder & Feder, 
1998). Therefore, establishing normative data through scientific research methodology is 
essential to support the reliability and validity of art-based assessments in cross-cultural 
settings. My research will contribute to the literature on the cross-cultural use of art-based 
assessment by looking at one art therapy assessment in particular, the Formal Elements 
Art Therapy Scale (FEATS) used with Person Picking an Apple from a Tree (PPAT).  
 
Review of the Literature 
The purpose of this study was to establish normative data to support the cross-cultural
use of one art-based assessment by empirically examining its cross-cultural
utility. This study examined the Formal Elements Art Therapy Scales (FEATS) with the 
use of the Person Picking an Apple from a Tree (PPAT) art directive. A brief history of the 
development of multiculturalism and its impact on mental health fields, including art 
therapy, will be presented in the literature review as well as issues surrounding the 
universality of art across cultures. The current dialogue regarding the cultural use of
art-based assessments will be introduced. Finally, the literature review will provide a
description of the FEATS and PPAT assessment, and the research conducted so far on its 
cross-cultural reliability and validity will be critically examined. The literature review 
ends with a brief summary of limitations and weaknesses in the existing research that led 
to my research question and informed the direction of my study.    
Multiculturalism 
Multiculturalism represents a broad range of thoughts in political philosophy 
about the appropriate approach to embrace cultural, religious, and ethnic diversity 
(Gutmann, 2003; Taylor, 1992; Young, 1990). Multiculturalism is a commitment to 
changing dominant patterns of representation and communication that marginalize certain 
groups (Gutmann, 2003; Taylor, 1992; Young, 1990). In the beginning, multiculturalism 
indicated a movement to recognize and accommodate cultures or cultural groups. Now, 
however, multiculturalism embraces a wide range of diversity including religion, 
language, ethnicity, and race (Song, 2008). 
 
Justifications for multiculturalism. There are three distinct justifications for the 
development of multiculturalism: (a) the communitarian critique of liberalism, (b) 
compatibility with liberalism, and (c) postcolonial perspectives (Kymlicka, 1989; Taylor, 
1995). The first rationale for multiculturalism grows out of the communitarian critique of 
liberalism. Liberalism holds that individual freedoms and rights are more
important than community life and collective goods. However, communitarians criticize
the idea that the individual takes priority over the community. On the contrary, they give
primacy to the value of the collective good, collective identity, and culture over 
individual freedoms, which facilitated the recognition of the equal worth of diverse 
cultures (Taylor, 1995).  
The second justification for multiculturalism arises from within liberalism, a 
political philosophy that is largely based on the values of autonomy and equality 
(Kymlicka, 1989). Because they prioritize autonomy and equality, liberals cannot be
bystanders when members of minority groups are disadvantaged by inequalities stemming
from their involuntary membership in minority cultures. The liberal recognition of this
disparity between reality and political ideology encourages the collective responsibility
of citizens to redress the inequalities and facilitates the growing development of
multiculturalism (Kymlicka, 1989).
Lastly, the late 19th and early 20th centuries saw rampant colonialism and
forms of fascism, such as Nazism, that discriminated against diverse cultures and races
and even led to massacres of minority groups. The global trend of multiculturalism comes
from reflection on this appalling violence against different voices,
which provoked people in the global community to consider cultural, religious, and
ethnic diversity (Song, 2008).
The importance of multiculturalism in the United States. In the United States, the
timeliness and import of multiculturalism have dramatically increased with rapid
diversification in population demographics across the nation. According to the 2012
census projection, Caucasian people will no longer constitute a majority of Americans by 
2043; the non-Hispanic white population, now at 197.8 million, is projected to peak at 
200 million in 2024 (United States Census Bureau, 2012). An important implication of 
the demographic changes is that no major ethnic group or particular “cultural world view” 
will dominate the United States, but it will instead become a multicultural society in 
which a variety of ethnicities and cultures coexist. With the rapid diversification of the 
U.S. population, many academic fields have been increasingly and necessarily challenged 
to conduct research on multiculturalism as a solution to the challenges involved in the 
newly diverse society; increasing diversity can lead to less cohesiveness, less effective 
communication, increased anxiety, and greater discomfort for many members of a 
community (Hollinger, 1995). An increasingly diverse society adds momentum to the 
growing import of multiculturalism in the United States, and calls for preparation for the 
multicultural society (Betts, 2013; Hocoy, 2002; Song, 2008; Sue & Sue, 1999).  
The effect of research on the practice of multiculturalism. In response to new 
challenges, a variety of academic fields, including sociology, pedagogy, political science, 
and humanities, have been increasingly challenged to prepare people for a multicultural 
society. People need to learn how to live together with culturally and ethnically diverse 
citizens. Various academic fields have studied how to tolerate and respect racial diversity, 
different cultural traditions, customs and language, and different religious customs as 
well as how to accommodate diversity in educational settings. A growing body of 
research has contributed to mutual understanding among different cultures and ethnicities, 
as well as minimizing the challenges and deriving maximum benefits from a multicultural 
society (Benhabib, 2002; Song, 2007). 
 The effect of multiculturalism on mental health fields. As in the broad range 
of academic fields, the import of multiculturalism is growing in mental health fields. 
With the strong indication of diversification, counselors and therapists are increasingly 
challenged to become multicultural treatment experts (Sue & Sue, 1999). Indeed, 
multiculturalism has been called the “fourth force” in helping professions, along with the 
other three forces, psychodynamic, humanistic and existential, and behavioral counseling 
theories and methods (Skovholt & Rivers, 2007). Knowledge and skills related to the 
fourth force, multiculturalism, are essential for understanding behaviors in the counseling 
process and for effective counseling in a multicultural society (Sue & Sue, 1999). 
 The effect of multiculturalism on the art therapy field. As members of a specialized
mental health field, art therapists are also increasingly challenged to become culturally
competent and useful to other cultures (Kaplan, 2003). Given the increasingly diverse
society, working and training cross-culturally has become increasingly important. 
Furthermore, to achieve the maximum benefits of a multicultural society, the art therapy 
field has given primacy to the training of individuals from minority cultures, with the 
purpose of providing compatible therapists for these communities (Hocoy, 2002).  
 More importantly, the advent of multiculturalism has raised serious questions
about the validity of art therapy itself in a cross-cultural context. Because art therapy
largely depends on the assumption that art is a universal form of
communication (Dissanayake, 1995; Malchiodi, 1998; Rubin, 1999), it had been
relatively free from accusations of Western cultural imperialism until multiculturalism
emerged (Kaplan, 2003). It was in response to multiculturalism that the art therapy field
started to reflect on its assumptions regarding art and to research aspects of art therapy
that may be Euro-centric. In particular, the art therapy field has begun to examine problems
that are inherent in the cross-cultural interpretation of art (Hocoy, 2002).
Art and Multicultural Competency 
The universality of art across cultures. The idea of a universal “aesthetic”
attempts to explain absolute beauty (Dissanayake, 1995; McNiff, 1984). This idea can be
confused with the universality of art. However, Dissanayake (1995) argued that the true
universality of art across cultures lies in an ethological view of art; only the behavior and
function of art within this context are universal from primitive society to modern society.
According to Dissanayake, humans everywhere want to differentiate between a realm,
mood, or state of being that is mundane and one that is extra-ordinary. The demand for
this distinction characterizes “universal predispositions of human behavior which are the
core behaviors of art; art serves to make important things and activities special”
(Dissanayake, 1995, p. 39). Such “specialty” (Dissanayake, 1995, p. 40) is associated
with positive factors of care and concern. This suggests that art as a special activity or
object appeals to emotion as well as perception and cognition, thereby serving all aspects
of our mental functioning (Dissanayake, 1995). In that sense, she argues that art making
serves as a normal and universal behavior of human beings; across cultures it is used to
express complicated emotions and thoughts.
 
 The cultural particularity of art. It appears to be uncontroversial that art is a
universal special activity that carries out special emotional and biological purposes
(Dissanayake, 1995; Malchiodi, 1998; McNiff, 1984; Rubin, 1999). In terms of form and
content, however, the universality of art is debatable (Acton, 2001; Hocoy, 2002). Due to
the nonverbal nature of art, there is an assumption that an art image has at least
conceptual and construct equivalences across cultures (Acton, 2001). For example,
McNiff (1984) asserted that formal elements, such as line, color, form, shape,
composition, and movement, are universal in art. However, art images may have different
conceptions and meanings in other cultures, since cultures have their own ways of 
categorizing phenomena and experiences. In fact, many studies (Acton, 2001; Hocoy, 
2002; Rubin, 1999) have demonstrated that interpretation of the meaning of images or 
forms is variable across cultures. If art images have significantly different conceptions 
and constructions in other cultures, we cannot exclude the possibility that art may be 
culturally situated rather than reflecting a particular dominant cultural worldview (Acton, 
2001; Betts, 2013; Hocoy, 2002; Rubin, 1999). 
Art Therapy 
Art therapy is a specialized mental health profession; it combines art and 
psychology to “promote self-awareness, change behavior, reduce anxiety, or increase 
self-esteem through the use of the creative process of art-making and the resulting 
artwork” (American Art Therapy Association, 2015, para. 1). Art therapy is considered  
an invaluable therapeutic tool that offers an alternative to verbal communication. Art 
therapy is appropriate for all individuals and groups, from children to older adults 
(Malchiodi, 2007). 
 
 Expectation of multicultural competency. Art therapy is one of the most 
responsive professions to the growing import of multicultural competency (Betts, 2013; 
Hocoy, 2002). In an increasingly diverse society, art therapists are challenged to become 
culturally competent therapists (Arrington, 2005; Betts, 2013; Calisch, 2003; Hocoy, 
2002; Kaplan, 2003). In fact, art therapy is widely regarded as being less culturally bound
than other therapeutic fields as it is less encumbered by linguistic expression (Cheryl,
2006; Hocoy, 2002; Rubin, 1999). McNiff (1984) emphasized art therapy’s cross-cultural
utility, asserting that the distinct universality of the art therapy process is grounded in its
potential for in-depth exploration on a cross-cultural basis, which is impossible within
more language-limited therapies.
However, Johnson (2001), Moon (2010), and Hocoy (2002) warned that like 
other mental health fields, art therapy can also be culturally and historically situated. 
Johnson (2001) argued that art therapy derives from a specific set of cultural assumptions 
and values that are uniquely Euro-American in origin. With the awareness of these 
concerns about art therapy, many art therapy leaders and educators (Betts, 2013; Feder &
Feder, 1998; Hocoy, 2002; Johnson, 2001; Moon, 2010) have suggested that art therapists and
students approach the development of cultural sensitivity through ongoing
self-examination and identification of biases and cultural competency. Many art therapists
work in cross-cultural and multicultural contexts; generally they are sensitive to fair and 
culturally relevant adaptations of their practices, but there is room for improvement 
(Hocoy, 2002). 
International art therapy. Early in the 20th century, art therapy emerged in the 
United States and Britain (Rubin, 1999). The American Art Therapy Association (AATA) 
and the British Association of Art Therapists (BAAT) disseminated art therapy by 
actively pursuing the development of membership nationally and internationally. 
International students have been educated by art therapists from both the United States 
and Britain (Arrington, 2005; Cruz, 2005; Stoll, 2005). The international students have 
taken their newly acquired knowledge of art therapy to their homelands, which 
contributed to the growth of art therapy around the world.  
Art therapy is gradually becoming international and recognized in many different 
countries. The growing development of national art therapy organizations in areas around 
the world, including Australia, North America, South America, Europe, Scandinavia, the 
Middle East, and Asia, exemplifies the global recognition and growth of the field of art 
therapy (Stoll, 2005; Wolf Bordonaro, 2015). Art therapists are actively organizing in 
more than three dozen countries, and more than two dozen additional countries have 
established art therapy associations (Cruz, 2005; Stoll, 2005; Wolf Bordonaro, 2015). 
According to Wolf Bordonaro (2015), national art therapy associations contribute to the 
global growth of the field of art therapy by (a) providing communication among members; 
(b) disseminating research and practice information; (c) establishing educational and 
ethical standards; and (d) advocating for governmental recognition.  
Even though interest in art therapy is growing around the world, few countries 
have successfully established recognition of professional qualifications or have formal 
governmental recognition of art therapy (Wolf Bordonaro, 2015). In particular, art 
therapists from Europe, South America, the Middle East, and Asia face many challenges, 
including (a) establishing accredited educational programs, (b) developing a professional 
scope of practice, and (c) gaining recognition by governments (Stoll, 2005; Wolf 
Bordonaro, 2015).  
Art therapy in Asia. The development of art therapy in Asia is as diverse as 
Asian countries themselves. As in other parts of the world, the uses of the arts in healing and
ritual are very much a part of Asia’s diverse cultures. From the mandalas of Tibetan
Buddhism to the details of Chinese calligraphy, traditional arts in Asia have been used for 
thousands of years to inspire and educate, while also serving as a healing process or 
meditation. In this sense, art as a healing tool has been consistently familiar to most Asian 
countries (Debra, Siu, & Jordan, 2012).   
In the late 20th century, pioneer artists and mental health counselors attempted to 
integrate indigenous use of the arts for healing into art-based models and art therapy 
theory. In particular, pioneer art therapists who sought education abroad have served as a 
bridge to integrate the traditional uses and values of art in Asia with scientific theories of 
art therapy from the West. Nevertheless, most Asians are not aware of the existence of art 
therapy or art therapy as a discipline (Debra, Siu, & Jordan, 2012).  
Stoll (2005) wrote that only four Asian countries, India, Hong Kong, South
Korea, and Japan, had their own art therapy associations. Only two Asian countries,
Hong Kong and South Korea, had established post-graduate art therapy training
programs. However, none of the four Asian countries had a nationally accredited
licensure system. Art therapists in Asian countries continue to fight for government 
recognition and legitimate support to establish university-based art therapy education 
programs (Debra, Siu, & Jordan, 2012; Stoll, 2005).  
 
Art-based Assessments with Multicultural Relevance 
With growing diversification and the broadening scope of art therapy, “it is ever 
more important for art therapists to ensure responsible and ethical treatment approaches 
and assessments” (Betts, 2013, p. 98). Many psychologists and therapists have developed
various art-based assessments in the United States, but there are few specifically designed 
to be used across cultural groups. Interestingly enough, however, many of these 
assessments have been successfully utilized with a variety of populations and adapted for 
cultural sensitivity and relevance (Betts, 2013; Hocoy, 2002).  
Cultural sensitivity and relevance of art-based assessments. Kaiser and
Deaver (2009) considered the Bird’s Nest Drawing a cross-culturally valid
assessment. They reviewed five studies that used the Bird’s Nest Drawing to
examine attachment in different populations, including mothers (Kaiser, 1996),
children (Trewartha, 2004), women with high-risk pregnancies (Overbeck, 2002), clients
with substance abuse disorder (Kaiser & Deaver, 2003), and foster children (Hyler, 2002).
Though Kaiser and Deaver (2009) admitted more peer-reviewed research is required to 
establish the validity of the Bird’s Nest Drawing with diverse cultural populations, they 
suggested that the assessment appeared to be a culturally reliable assessment for diverse 
populations.   
 Arrington and Yorgin (2001) and Jung and Kim (2010) found the Favorite Kind 
of Day assessment (Manning, 1987) a culturally sensitive and relevant assessment. Using 
the drawing-based assessment, Arrington and Yorgin (2001) measured the psychological 
status of orphaned and homeless children in Kiev, Ukraine. Jung and Kim (2010) 
conducted a normative study of the same assessment tool using a Korean sample of 107 
female and 46 male undergraduate students. The two studies produced favorable results 
that demonstrated validity in measuring depression; this supported the cultural relevance 
and utility of the assessment.  
Williams, French, Picthall-French, and Flagg-Williams (2011) conducted a 
review of the literature on projective assessments seeking cross-culturally relevant and 
valid assessments. Of the assessments they reviewed, they suggested that the Human 
Figure Drawing tasks demonstrated the most cross-cultural adaptability and versatility. 
The authors claimed that the universality of the human figure is reliable for use by people 
of any age, gender, and cultural background.  
Import of research on cultural competency of art therapy assessment. Cultural
sensitivity is paramount because a culturally blind art-based assessment can lead to the
mistreatment of clients in different cultural settings. Since assessment is
always the first step in treating clients, it is necessary to scrutinize the assumptions
and reliability of art-based assessments that may or may not be applicable to other
cultures. Many art-based projective assessments have been successfully utilized with 
diverse populations. Thanks to the universal application of art, these assessments appear 
to be adaptable for cultural sensitivity and relevance. However, almost none of these
assessments were empirically supported using standardized outcome measures, even if 
there were reasonable “observations” regarding universal elements in the drawings across 
diverse populations (Arrington, 2005; Betts, 2013; Calisch, 2003; Hocoy, 2002). Betts 
(2013), Gantt and Tabone (1998), and Hocoy (2002) claimed that the field of art therapy
lacked scientific normative data to support subjective observations; they argued that most
studies of art therapy assessments contained major methodological weaknesses and
limitations. Betts (2013) and Hocoy (2002) suggested that much more empirical research
using standardized measurements is required in art therapy to establish the effectiveness and
reliability of art-based assessments in cross-cultural settings.
Formal Elements Art Therapy Scale and Person Picking an Apple from a Tree 
 The Person Picking an Apple from a Tree. The art therapy assessment, Person 
Picking an Apple from a Tree (PPAT), is an art-based assessment designed to identify a 
client’s mental health symptoms and progress. The PPAT assessment was first described 
by Viktor Lowenfeld (1939, 1947) in a study he conducted on children’s use of space in 
art. However, he did not discuss his reason for using it, and little information had been 
written about the drawing assessment. The PPAT drawing was simply utilized as a 
projective assessment, and its interpretation largely depended on the individual clinician’s 
intuition and experiences.  
It was Gantt and Tabone (1998) who identified the potential of the PPAT drawing 
as a reliable art-based assessment to evaluate a client’s clinical state as well as response 
to treatment. They were particularly interested in developing a standardized art-based 
assessment that was useful to both clinicians and art therapists, since they had identified
methodological weaknesses in most art-based and projective assessments, as well as
problematic assumptions that clinicians made in interpreting assessment
results.
Gantt and Tabone (1998) found that the PPAT drawing had four advantages that
made it suitable as a standardized art-based assessment. These advantages
included its (a) applicability to any patient regardless of their degree of artistic ability, 
intelligence, or interest; (b) simple and direct instructions; (c) constancy of content to 
allow for obtaining valid and useful information by comparing productions of different
clients and of the same clients at different times; and (d) emphasis on objective structure and
form, rather than on subjective content and symbolism (Gantt & Tabone, 1998). The
authors standardized materials and instructions with several other researchers (Gantt & 
Tabone, 1998; Munley, 2002; Rockwell & Dunham, 2006; Williams et al., 1996) and 
began studying the PPAT systematically to establish empirical support for the clinical
utility of the PPAT assessment.  
The Formal Elements Art Therapy Scales. The Formal Elements Art Therapy 
Scale (FEATS) is a measurement system that applies interval rating scales to formal art 
elements in two-dimensional art, in particular, the Person Picking an Apple from a Tree
(PPAT) assessment (Gantt & Tabone, 1998). The instrument was first developed by Gantt 
and Tabone (1998) with the intent of establishing a scientifically valid measurement to 
demonstrate correlation between psychiatric symptoms and globally objective elements 
in art. The FEATS evaluation of the PPAT consists of 14 individual scales that rate the 
formal elements of a drawing that demonstrate graphic equivalents to psychiatric 
symptoms.  
What was groundbreaking about Gantt and Tabone’s analysis was the
identification of objective indicators. Until that time, clinicians had difficulty explaining
diagnostic clues they found in art unless they relied on the psychoanalytic
theories of Sigmund Freud and Carl Jung, who emphasized interpretation of symbolic
meaning (Groth-Marnat, 1990). The paradigm in the 20th century relied upon a 
“dictionary approach” (Gantt & Tabone, 1998, p. 53) to understanding images by 
decoding symbolic meaning. As belief in psychoanalytic theory decreased, however, this 
approach was criticized for lack of scientific merit. Clinicians were challenged to respond 
to criticisms about their methods and assumptions.
Gantt and Tabone (1998), therefore, decided to base the FEATS instrument on 
pattern-matching methodology to help clinicians accurately distinguish drawings 
containing graphic indicators of symptoms that correlated to four clinical diagnoses: 
schizophrenia, bipolar disorder, major depression, and organic mental disorders including 
delirium, dementia, amnesia, and other cognitive disorders. Adopting the pattern-matching
method, Gantt and Tabone (1998) developed the 14 individual scales of the
FEATS using three sources: (a) their own clinical experience and observation, (b) the art 
therapy and psychology literature on the art and projective drawings of psychiatric 
patients, and (c) the four psychiatric symptoms from the Diagnostic and Statistical 
Manual III (American Psychiatric Association, 1994). Five-point Likert scales were then 
utilized on each of the 14 scales of the FEATS to rate the formal elements of the image. 
The 14 individual scales were intended to measure global attributes common to art in 
general and included the formal elements identified below. 
Prominence of color. Prominence of color, the first scale of the FEATS, 
measures how much color a person uses in the entire picture (Gantt & Tabone, 1998). For 
example, this scale identifies whether color is used only to outline a form or is used 
appropriately to fill in the form and background. In general, it is believed that color is 
related to affect. Multiple studies have reported that emotion is positively correlated to 
color and that people with mood disorders employ either little color or a great deal of 
color (Gantt & Tabone, 1998; Groth-Marnat, 1990; Wadeson, 1980).  
Color fit. Color fit, the second scale of the FEATS, examines whether a person 
uses colors in the PPAT that are appropriate to the object depicted (Gantt & Tabone, 
1998). Extraordinary use of color is related to illogical thinking or difficulty in 
integrating affective experience (Amos, 1982; Robertson, 1952). However, there has been 
little theoretical speculation regarding color fit as related to illogical thinking (Gantt & 
Tabone, 1998).  
Implied energy. Implied energy assesses the amount of energy used to make the 
drawing. In other words, this scale measures how much energy and apparent effort a 
person takes to complete the entire drawing. Gantt and Tabone (1998) reported that in 
their clinical experiences they had seen what appeared to be a strong relationship between 
the amounts of energy used in drawings and the amount of manic activity demonstrated 
by their patients.   
Space. This scale examines the amount of space utilized for the PPAT drawing. 
To clarify, the scale measures what percentage of the paper a person uses in the entire 
drawing (Gantt & Tabone, 1998). Gantt (2004) reported that depressed patients tend to
draw smaller figures using less space, while people with manic disorder tend to draw 
bigger figures using more space.   
Integration. Integration measures the degree to which the items in the picture are 
balanced into a cohesive whole. This scale is essential to the PPAT assessment since the 
PPAT assumes that specific elements, such as the apple, the tree, and the person, have a 
relation to one another (Gantt & Tabone, 1998). Lack of integration or chaotic 
organization of art is more likely related to schizophrenia (Russell-Lacy, Robinson, 
Benson, & Cranage, 1979). 
Logic. This scale attempts to distinguish illogical responses to the request for  
the drawing. Gantt and Tabone (1998) claim that making a rating on this scale is not an 
easy task; raters often have difficulty differentiating illogical responses from funny or 
satirical ones. Several studies (Arieti, 1976; Amos, 1982; Groth-Marnat, 1990) have 
shown that lack of logic in drawings is related to impairment in abstract thinking.
Realism. Realism, the seventh scale, measures the degree to which items are 
realistically drawn. This scale attempts to assess whether the items in the picture, such as 
tree, person, and apple, are recognizable and realistically drawn (Gantt & Tabone, 1998). 
Groth-Marnat (1990) and Gantt and Tabone (1998) reported that unrecognizable items in 
drawings were related to Alzheimer’s disease and grandiose ideology.   
Problem-solving. The eighth scale, problem-solving, is primarily concerned with 
whether and how the drawn person gets the apple out of the tree. This scale measures 
whether the person can get the apple in a relatively reasonable fashion or not. Gantt and 
Tabone (1998) asserted that lack of problem-solving skills in the PPAT drawings is 
correlated to manic disorder. 
Developmental level. This scale attempts to measure a person’s developmental
level by comparing the drawing with Lowenfeld’s (Lowenfeld & Brittain, 1987) 
developmental stages of creative growth in children. In other words, this scale assesses 
whether the drawing is an artistically unsophisticated drawing or an “adult” drawing 
(Gantt & Tabone, 1998). This scale has been controversial because developmental level is 
influenced by education, art training, and socioeconomic level (Gantt, 2004).
Details of object and environment. This scale measures the relative amount of 
detail in the PPAT drawings. Gantt and Tabone (1998) wrote that average non-patients are 
able to provide the essential details of the subject matter, including the person and tree. 
As in the case with the third scale, implied energy, the authors argued that a low score on 
this scale is associated with major depression and a high score with mania (Gantt & 
Tabone, 1998).  
Line quality. This scale, line quality, attempts to assess the amount of control a 
person seems to have over the variety of lines in the picture. In other words, a person who 
is in control of both the medium and their hands can make lines of different weights and 
lengths (Gantt & Tabone, 1998). Several studies have examined the relation of line
quality to psychiatric symptoms. According to Wilkinson and Schnadt (1968), patients
with paranoid schizophrenia tended to produce lines that were heavier than those
created by non-patients. Moreover, Vernier, Stafford, and Krugman (1958) reported that the
drawings of patients with organic disorders included an abundance of sketchy and broken 
lines.   
Person. This scale attempts to assess whether a person is able to draw the person 
in the PPAT to look like a three-dimensional person rather than a stick figure. Gantt and 
Tabone (1998) argued that a person with a distorted sense of self is more likely to draw a 
human figure which is severely distorted or fragmented. Evans (1984) also demonstrated 
that patients with schizophrenia tended to draw the human figure with disproportionate 
body parts. 
Rotation. This scale assesses the amount of tilt that the person and/or the tree 
presents. Gantt and Tabone (1998) argued that the tree and the person in a PPAT drawing
would normally be reasonably upright. This scale is designed to identify variables associated with
brain damage or emotional disturbance. Gantt and Tabone (1998) reported that patients with
brain damage often drew figures that were extremely tilted.
Perseveration. The last scale, perseveration, assesses whether a person engaged 
in extremely repetitive graphic activity. In other words, the scale of perseveration is 
concerned with the repetition of a single graphic element or motor act, such as making 
repeated loops for apples (Gantt & Tabone, 1998). Cuneo and Welsh (1992) indicated that 
perseveration was associated with psychiatric disorders such as Alzheimer’s disease, autism, and
learning disabilities. 
Reliability and Validity of the FEATS with the PPAT in Clinical Settings 
To develop a scientific art-based assessment, the originators conducted several 
pilot studies to determine if the drawing of “a person picking an apple from a tree” (PPAT) 
as an assessment tool carried sufficient diagnostic information. In their pilot studies, 
Gantt and Tabone (1998) collected PPAT drawings from patients with one of six 
categories of psychiatric disorders: manic disorder, depression disorder, schizophrenia, 
intellectual disability, organic disorder, and impulse control disorder. They asked 
professional clinicians to classify the drawings into diagnostic categories without any 
knowledge about the person who drew the picture. The results confirmed that based on 
the drawings alone, most of the evaluators made correct decisions more often than not 
(Williams, Agell, Gantt, & Goodman, 1996).
After Gantt and Tabone verified the validity of the PPAT assessment, they 
continued to conduct pilot studies on the reliability and validity of the FEATS as an art 
therapy assessment tool (Williams, Agell, Gantt, & Goodman, 1996). To establish the
reliability of the FEATS in their studies, they engaged three different groups of three 
raters. The first group consisted of art therapists, another group was comprised of social 
workers, and the final group was comprised of recreation therapy students. Each group 
was trained to use the FEATS scales. Gantt and Tabone gave each of the groups the same 
ten PPAT pictures to rate. The results demonstrated inter-rater reliability
of .90 and above for 13 of the 14 scales, the exception being the rotation scale (Gantt, 1990).
Once Gantt and Tabone (1998) established the high inter-rater reliability of the 
FEATS, they conducted pilot studies to determine if the FEATS instrument was valid and 
actually measured what they designed it to measure. They collected drawings, with 
permission, from patients who met strict criteria for one of two psychiatric disorder 
categories, Axis I and Axis II Disorders in the DSM-III (American Psychiatric 
Association, 1980). Based on the psychiatric disorder categories the patients met, Gantt 
and Tabone assigned the pictures to an experimental group or a control group. Using an 
analysis of variance (ANOVA), they found that 10 of the 14 scales distinguished between 
two or more groups with 85% accuracy; the variance between groups was
significantly greater than the variance within groups, F = 64.0456, at a significance level of p ≤ .05 (Gantt, 1990, 1993).
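The F statistic reported in analyses of this kind is the ratio of the between-group mean square to the within-group mean square. The following is a minimal sketch of that computation for a single FEATS scale. The ratings and the function name one_way_anova_f are hypothetical illustrations and are not data or procedures taken from the studies cited above.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of score lists."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    group_means = [mean(group) for group in groups]
    k = len(groups)          # number of groups
    n = len(all_scores)      # total number of drawings

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(group) * (g_mean - grand_mean) ** 2
        for group, g_mean in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum(
        (score - g_mean) ** 2
        for group, g_mean in zip(groups, group_means)
        for score in group
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical ratings on one FEATS scale (1 to 5) for two groups of drawings.
patient_scores = [2.0, 2.5, 3.0, 2.5]
control_scores = [4.0, 4.5, 5.0, 4.5]
f_stat = one_way_anova_f([patient_scores, control_scores])
print(round(f_stat, 2))
```

A large F value indicates that the group means differ by more than the spread of scores within each group would suggest; whether it is significant depends on the degrees of freedom (k - 1 and n - k) and the chosen alpha level.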
Although the studies demonstrated the reliability and validity of the FEATS with 
the PPAT, the sample sizes in the studies were too small to generalize reliability and
validity. Thus, replicating the studies using larger samples to establish empirical support 
was necessary. Munley (2002) conducted a study to verify the original findings 
supporting the utility of the FEATS with the PPAT instrument. In her study, Munley 
(2002) wanted to explore whether children with AD/HD responded differently to the 
PPAT assessment as measured by the FEATS, compared to children without learning or 
behavioral disorders.  
In her descriptive matched-pair experiment, Munley (2002) selected two separate 
groups, a case group and a control group. The case group included five male Caucasian 
children aged 5 to 10 years old who were diagnosed with AD/HD and comorbidity for 
possible Conduct and Adjustment Disorder or Depression and Adjustment Disorder. The 
control group included five male Caucasian children, ages 5 to 12, without known 
learning or behavioral disorders. Munley (2002) hypothesized that the case group, the 
children with AD/HD, would rate differently on the scales of the FEATS than the control 
group, the children without behavioral disorders or learning disabilities. In addition, she 
hypothesized that the PPAT drawing responses measured with the FEATS which were 
obtained from the children with AD/HD would have similarities to others within their 
group, but would be different from those of the control group.  
Munley’s study (2002) supported the two hypotheses, demonstrating that 
children with AD/HD scored differently on the FEATS, and that their PPAT drawing 
responses had similarities to others within their group but had differences from those 
from the control group. Using an analysis of variance (ANOVA) and logistic regression
analysis, Munley (2002) demonstrated that the between-group variance was significantly 
greater, F=62.0383, compared to within group variance at a significance level of p≤ .05. 
In addition, she reported that the FEATS with the PPAT assessment distinguished 
between the two groups with 97% accuracy, and that interrater reliability correlations 
were strong for both groups at the significance level of p≤ .05, with no value less 
than .638 for the control group and none less than .670 for the case group. As a result, 
Munley’s study (2002) helped support the original findings obtained by Gantt and 
Tabone’s study (1998). 
Rockwell and Dunham (2006) also supported the validity and reliability of the 
FEATS with the PPAT in a clinical setting. The authors assessed the use of the FEATS 
with a population of persons with Substance Use Disorders. Adopting a matched-pair 
experiment, they established two separate groups, an experimental and a control group. 
The experimental group was comprised of 20 adults with a DSM-IV diagnosis of 
Substance Use Disorder; the control group included 20 adults with no psychiatric 
diagnoses.  
Utilizing an analysis of variance (ANOVA), Rockwell and Dunham (2006) found
that 12 scales of the FEATS were able to distinguish between the members of the two 
different groups with an average 85% accuracy. In particular, they emphasized that three 
individual scales of the FEATS, Realism, Developmental Level, and Person, were 
particularly different between the two groups. The experimental group obtained 
significantly lower scores on those scales. Also, the study demonstrated that the interrater 
reliability correlation was strong for both groups at the significance level of p≤ .05. As a 
result, Rockwell and Dunham (2006) supported the original findings identified by Gantt 
and Tabone’s study (1998), which argued the FEATS instrument with the PPAT drawing 
was a reliable and valid assessment in clinical practice to screen for people with mental 
illness. 
Along with the two replicated studies described above, other scholars and 
researchers have reported the utility of the FEATS in coordination with the PPAT 
drawings. Gantt (2001) demonstrated in her study that the FEATS with the PPAT was 
able to identify symptoms associated with schizophrenia, major depression, bipolar 
disorder, and cognitive disorders. Gussak (2004, 2006, 2007) reported successful use of 
the FEATS with the PPAT to identify the degree of severity over time for symptoms 
related to depression. White, Wallace, and Huffman (2014) demonstrated that PPAT 
drawings measured by the FEATS successfully identified disordered thinking among
students with emotional and behavioral disorders.
The Normative Study of the FEATS with the PPAT 
Since the FEATS and the PPAT were standardized by Gantt and Tabone (1998), 
there have been replicated studies empirically supporting the reliability and validity of the 
FEATS instrument with the PPAT drawings in clinical settings. However, few studies have
produced large-scale normative data. Large-scale normative studies are essential to
empirically validate assumptions about non-patient clients’ projective drawings (Gantt &
Tabone, 1998; Williams, Agell, Gantt, & Goodman, 1996). In other words, a baseline must
be established so that the FEATS and the PPAT can be used as a standard tool in a variety
of counseling and research settings. Although Gantt and Tabone (1998) discussed the 
need for normative data and described the patterns they observed in the drawings of their 
non-patient groups, they did not report normative data beyond what they observed in the
drawings of their non-patient group.
It was Bucciarelli (2011) who initially attempted to support the development of 
large-scale normative studies of the FEATS with the PPAT. She recruited 100 non-patient 
participants using a convenience sample method. The non-patient participants were 
comprised of 46 males and 54 females with a variety of ethnicities; 60 participants 
identified themselves as White, 13 as Hispanic, 11 as Black, three as Biracial, and 13 as 
other. She also investigated the influence of gender and ethnicity on the assessment
results to establish normative data empirically supporting reliability and validity of the 
FEATS with the PPAT.  
Bucciarelli’s (2011) study demonstrated strong inter-rater reliability at the 
significance level of p≤ .05 on all of the FEATS scales except one, Perseveration.
This result was consistent with previous research that reported strong interrater reliability 
on 12 of the 14 scales (Gantt & Tabone, 1998; Munley, 2002; Rockwell & Dunham, 
2006). The result supported again the reliability of the FEATS with the PPAT. 
More importantly, Bucciarelli’s study produced normative data
that supported Gantt and Tabone’s (1998) predictions about formal elements in a normative
sample, with the exception of the Developmental Level scale. Gantt and Tabone (1998)
predicted that non-client or non-patient drawings would score, on average, 4.0 or higher on
the Likert scales of 1 to 5 for most of the FEATS scales. Gantt (1998) hypothesized that
non-patient PPAT drawings would have (a) appropriate color use; (b) logical and balanced
composition; (c) reasonable amount of details, color, and energy; (d) realistic and 
reasonable depiction of a person; (e) developmental features common to adolescent 
drawings; and (f) depiction of a practical way for getting an apple out of a tree.  
Bucciarelli’s statistical normative results confirmed nearly all of the predictions
about non-patient drawings; seven scales of the FEATS corresponded with the predictions
of a score above 4.0. However, her study also found that half of the scales for the
non-patient drawings indicated, on average, a score lower than 4.0. Finally, Bucciarelli’s
study (2011) demonstrated there were significant differences on some scales of the 
FEATS in terms of gender and ethnicity. For example, male participants scored 
significantly lower than female participants on scales of Space, Integration, and Line 
Quality. Furthermore, Bucciarelli (2011) found a significant difference on the 
Perseveration scale between the drawings of White participants and those of Black 
participants at the p≤ .05 significance level. 
As a result, Bucciarelli’s study (2011) was the first to provide empirical data to 
establish a normative baseline for the utility of the FEATS instrument with the PPAT
assessment. Her data confirmed nearly all of the predictions about non-patient drawings, 
and served as a milestone to facilitate future normative studies. Her study, however, also 
indicated that there were significant differences on some scales of the FEATS with 
respect to participants’ gender and race. This result brought up new questions about 
cross-cultural reliability and validity of the FEATS with the PPAT.   
The Multicultural Reliability and Validity of the FEATS with the PPAT 
Although Bucciarelli’s study (2011) contributed to the development of empirical 
normative data for the FEATS with the PPAT, one limitation of her study was the
methodological weakness associated with a convenience sample drawn from a particular
geographic location, age group, and culture. More importantly, the sample for her study was
predominantly Caucasian American. This weakness was not unique to her study. In fact, a 
serious limitation of the FEATS is that the original samples obtained from the originators 
were predominantly Caucasian American. Indeed, much of the subsequent research has 
been conducted with similarly Caucasian American-dominated samples (Bucciarelli, 2011;
Munley, 2002; Rockwell & Dunham, 2006). To validate and generalize all the previous 
affirmative results of the FEATS, it seems necessary to conduct a similar normative study 
with a variety of cultures and ethnic groups.   
To address the limitation, Nan and Hinz (2012) aimed to scrutinize the reliability 
of the FEATS with an Asian population. To establish normative data and the reliability of the
FEATS with this population, their study included a sample of 51 non-patient Chinese 
individuals living in Hong Kong, conveniently selected from two local colleges, a high 
school, and a local Christian church. Nan and Hinz collected the drawing data from the 
51 Chinese participants within a period of 2 to 3 weeks. To measure interrater reliability 
and reliability of the FEATS in a cross-cultural context, they used Cronbach’s alpha and 
Pearson’s r correlation.
Nan and Hinz (2012) found that the Cronbach’s coefficient alpha of the 14 
FEATS scales was .870; this indicated that the FEATS was a reliable instrument with strong
internal consistency in the cultural context of an Asian population. Also, they found 
strong inter-rater reliability correlations in the majority of individual FEATS scales at the 
significance level of p≤ .05, with only two exceptions, the Line Quality and Rotation 
Scales. These results supported strong interrater reliability of the FEATS in an Asian 
population.  
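Cronbach’s alpha expresses internal consistency as k / (k - 1) multiplied by one minus the ratio of the summed item variances to the variance of the total scores, where k is the number of scales. The sketch below illustrates that calculation under simplifying assumptions: the ratings are hypothetical, only three scales are shown instead of 14, and the function name cronbach_alpha is an illustrative choice, not a procedure reported by Nan and Hinz (2012).

```python
from statistics import variance


def cronbach_alpha(rows):
    """Return Cronbach's alpha.

    rows: one list of scale scores per drawing; all lists have equal length.
    """
    k = len(rows[0])                       # number of scales (items)
    columns = list(zip(*rows))             # scores regrouped by scale
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings: four drawings scored on three scales (1 to 5).
ratings = [
    [3.0, 4.0, 3.0],
    [4.0, 5.0, 5.0],
    [2.0, 2.0, 3.0],
    [5.0, 5.0, 4.0],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

Values closer to 1 indicate that the scales vary together across drawings; the .870 reported above would conventionally be read as strong internal consistency.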
Most importantly, the normative data gained in their study indicated that the 
majority of the mean scores on the 14 FEATS scales were nearly consistent with the 
originators’ predictions about non-patient drawings and with normative data (Gantt & 
Tabone, 1998; Bucciarelli, 2011). The mean scores on 13 scales of the 14 FEATS scales 
in Nan and Hinz’s study fell in line with mean scores in Bucciarelli’s study (2011), except 
on the scale of Prominence of Color. 
The similarity of normative data between Nan and Hinz’s study and Bucciarelli’s 
study supported cross-cultural utility of the FEATS instrument with the PPAT drawings, 
as the FEATS originators suggested. However, Nan and Hinz’s study (2012) also
found a notable difference in the overall variability and standard deviations of the 14
FEATS scales between the two studies. The overall variability and standard deviations for
the Asian sample were higher than those of the American sample in Bucciarelli’s (2011) study,
particularly regarding the Color Fit, Logic, and Integration Scales (Nan & Hinz, 2012). 
Nan and Hinz suggested that this variability in drawings and ratings may indicate a 
somewhat greater disparity in drawing style or ratings in the Asian sample on these three 
variables. 
Oh (2013) conducted a pilot study to examine the cross-cultural reliability of the
FEATS with the PPAT. Oh recruited 51 undergraduate college students enrolled in a
mid-sized university in the Midwest of the United States. The 51 participants consisted of 8
Asian students, 7 Hispanic students, and 36 American students. He collected the drawing 
data from the three groups of different cultural backgrounds within a period of 1 to 2 
weeks. For rating the PPAT drawings, he had one rater group consisting of a graduate student from an Asian
cultural background and another consisting of a graduate student from an American cultural
background. To measure inter-rater reliability of the FEATS, he used a Pearson
correlation. In addition, he utilized an ANOVA test to identify statistical differences in the
scores on the 14 scales of the FEATS among the three different groups.  
Oh’s pilot study (2013) demonstrated strong inter-rater reliability correlations on
12 of 14 FEATS scales. This result supported the FEATS as a reliable instrument in
a cross-cultural context. In addition, the normative statistics gained from the three groups
in his study indicated that the majority of the mean scores on the 14 FEATS scales were
nearly consistent with the originators’ predictions about non-patient drawings and with
normative statistics (Bucciarelli, 2011; Gantt & Tabone, 1998; Nan & Hinz, 2012). The
similarity of normative statistics between Oh’s study (2013) and the previous studies
supported cross-cultural utility of the FEATS instrument with the PPAT drawings.
More importantly, Oh (2013) found no statistically significant difference in
scores on 13 of the 14 FEATS scales among the three cultural groups at the
significance level of p ≤ .05, with only one exception, the Integration Scale. This result
supported the assumption of the usefulness of the FEATS with the PPAT drawings across
cultural contexts. However, he also found a notable statistical difference on the
Integration Scale between the Asian group and the American group. He suggested that this
difference may result from the two different worldviews on which the cultural
groups are based, individualism and collectivism.
Summary and Research Hypotheses  
Nan and Hinz’s study (2012) provided valuable normative data to establish a
baseline for cross-cultural utility of the FEATS instrument with the PPAT drawings. Their
study, however, had a serious limitation in that the 51 Chinese participants living in Hong
Kong were unlikely to be representative of all Asians or Asian cultures (Nan & Hinz,
2012). In addition, the sample of 51 participants was too small to
generalize the results to other populations (Kaplan & Saccuzzo, 2005). According to
Fraenkel and Wallen (2006), a research study needs a sample size of at least 100 people
to reach significance.
More importantly, Nan and Hinz’s study (2012) was the only research
conducted to establish normative data on reliability and validity of the FEATS with the
PPAT assessment within a non-American culture. Normative studies are necessary to
generalize cross-cultural utility of the FEATS with the PPAT. In addition, only one
cross-cultural study (Oh, 2013) directly compared normative statistics among more than two
different cultural groups. Nan and Hinz’s study (2012) indirectly compared its results to
the previous normative data obtained by Bucciarelli (2011) with a multicultural focus. Oh
(2013) completed the only direct cross-cultural comparison, but with a small sample.
Therefore, it was my intention, in this study, to re-examine cross-cultural utility 
of the FEATS with the PPAT by directly comparing two samples of different cultural 
groups, American and Asian populations. To address the limitations of Nan and Hinz’s 
study (2012), this study engaged a more diverse ethnic Asian sample group including 
Chinese, Japanese, and Korean participants. Additionally, it included a larger sample of at least
100 participants across the American and Asian groups. The second purpose of
this study was to contribute to the growing body of normative data for two different 
culture groups, so that a baseline could be established to use the assessment as a standard 
tool in a variety of cultural and research settings. My research hypotheses were:  
1. There is cross-cultural reliability of the assessment instrument, the FEATS, 
between Asian and American raters at a university in the United States. 
2. Normative statistics will be obtained in this study that are consistent with the 
originators’ (Gantt & Tabone, 1998) predictions about non-patient drawings 
and with normative statistics (Bucciarelli, 2011; Nan and Hinz, 2012). 
3. There is no difference in the scores of the two college student groups, 
Americans and Asians, on the majority of scales of the assessment 
instrument’s measurement of various aspects of psychological health. 
 
 
CHAPTER 2 
METHOD 
 The purposes of this quantitative study were first, to test whether the Formal 
Elements Art Therapy Scales (FEATS) instrument in coordination with the Person 
Picking an Apple from a Tree Drawing (PPAT) (Gantt & Tabone, 1998) would be a 
reliable art therapy assessment in a cross-cultural context, and second to establish 
empirical support for the development of normative data on the cross-cultural application 
of the FEATS scales for two different cultural groups: American and Asian college 
students. The focus of the data analysis was on the influence of cultural backgrounds on 
the responses to the art therapy assessment, the consistency between the results of this 
study and those of previous studies conducted for the same purpose, and the evaluation of 
the cross-cultural utility of the art therapy assessment. A stratified quantitative design was 
used to compare the responses of international Asian college students with American 
college students. To select the sample of the two distinct cultural groups, students served 
by the Office of International Education and the psychology counseling and sociology 
departments at a mid-sized public university in the United States participated in this study.    
Participants 
The sample. The target population for this study was American
undergraduate students who were raised in Western cultures and Asian undergraduate
students who were raised in Asian cultures, all of whom attended a mid-sized public
university in the Midwestern United States. American undergraduate students were
born and raised in the USA, aged 18 to 28, and enrolled in the
2015 spring semester. Participants who represented Asian cultures were
international Asian undergraduate students, including Asian exchange students, who were
born and raised in Asia, aged 18 to 28, and enrolled in the 2015 spring semester. All of
the 57 American participants identified their cultural background as American or Western
European and gave their country of birth as the United States. All of the 57 Asian
participants identified their cultural background as Asian but originated from three
different countries: 28 participants identified their country of birth as South Korea, 15 as
China, and 14 as Japan. The total sample of participants from both cultural
groups consisted of 64 female participants and 50 male participants. Table 1 summarizes
the characteristics of participants from both cultural groups. To establish
normative results, I excluded any participants, Asian or American, from the study if they
had a self-reported psychiatric diagnosis according to the DSM-IV-TR (American
Psychiatric Association, 2000) or DSM-5 (American Psychiatric Association, 2013), as
both manuals were in use during 2013 and 2014.
Table 1
Summary of Characteristics of Participants

Characteristic        Asian Participants (N=57)    American Participants (N=57)
Gender
  Male                20                           30
  Female              37                           27
Age
  18 to 21            39                           34
  21 to 24            15                           18
  25 to 28            3                            5
Country of Birth
  Korea               28                           0
  Japan               14                           0
  China               15                           0
  United States       0                            57

Selection methods. This study utilized a random sampling method to select 60
American participants from a convenience sample of 100 American participants.
Originally, I planned to draw the convenience sample of 100 American participants from the
research pool of the university’s psychology department. However, since the research pool at the
university was unavailable for this study, alternative options were used to obtain the
convenience sample. The primary researcher personally
contacted professors and instructors of undergraduate sociology, art therapy, and mental
health counseling classes to recruit 78 American participants. In addition, I
recruited an additional 22 American participants from the university library by randomly
asking undergraduate students studying in the library to voluntarily participate in this
research.
Originally, the primary researcher planned to select 60 Asian students by
convenience from the list of international Asian students enrolled in this university, with the
help of the Office of International Education. However, only 57 Asian students were
selected from the list, with the help of student leaders of Asian student communities,
since three students were unavailable due to time conflicts with their class schedules.
Both samples, 100 American and 57 Asian participants, were selected within a
week, and seven American participants who self-reported a DSM-IV-TR or DSM-5
diagnosis on the questionnaire were excluded from the study. None of the 57 Asian
participants self-reported any type of mental illness or symptoms on the questionnaire.
Disproportionate stratified sampling was used to identify up to 114 participants,
with equal numbers from each group selected. Fifty-seven American students were
randomly selected from the initial sample of 93 American participants, and 57 Asian
international students were selected by convenience from the list of Asian students.
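The selection steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the participant IDs, the fixed random seed, and the function name `select_participants` are hypothetical, and the thesis does not report which tool was actually used for the random draw.

```python
import random

def select_participants(american_pool, asian_pool, excluded_ids, n_per_group=57, seed=0):
    """Drop excluded volunteers, then form two equal-sized groups."""
    rng = random.Random(seed)  # fixed seed only so this sketch is reproducible
    eligible_americans = [p for p in american_pool if p not in excluded_ids]
    eligible_asians = [p for p in asian_pool if p not in excluded_ids]
    # Random draw from the larger American stratum (93 eligible in the study)
    americans = rng.sample(eligible_americans, n_per_group)
    # The Asian stratum was a convenience sample that was used in full
    asians = eligible_asians[:n_per_group]
    return americans, asians

# Hypothetical IDs standing in for the volunteers
american_pool = [f"US{i:03d}" for i in range(1, 101)]  # 100 American volunteers
asian_pool = [f"AS{i:03d}" for i in range(1, 58)]      # 57 Asian volunteers
excluded = set(american_pool[:7])                      # stand-in for the 7 exclusions

americans, asians = select_participants(american_pool, asian_pool, excluded)
print(len(americans), len(asians))  # 57 57
```

Because the two strata are sampled at different rates (57 of 93 versus 57 of 57), the resulting design is disproportionate, which matches the description in the text.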
Research Design and Instrument 
 Logistics. This study used a descriptive and comparative quantitative design. A 
stratified quantitative design was used to compare the responses of American students to 
the PPAT scored using the FEATS to those of international Asian students. The two 
different cultural groups followed the same procedures and direction described below. 
Data from each of the two different cultural groups was collected in separate sessions 
within one week. 
The PPAT. The Person Picking an Apple from a Tree (PPAT) (Lowenfeld, 1939) 
(Appendix A) was used as an assessment. The art therapy assessment, “person picking an 
apple from a tree,” was first described by Lowenfeld (1939) as a projective drawing 
assessment to determine diagnostic symptoms. The drawing of “person picking an apple 
from a tree” was considered an applicable and direct method to solicit useful and valid 
information regarding an individual’s psychiatric condition (Gantt & Tabone, 1998;
Munley, 2002; Rockwell & Dunham, 2006; Williams et al., 1996).  
The FEATS. This study utilized the Formal Elements Art Therapy Scales 
(FEATS) (Appendix B) to rate and score drawings of the PPAT from participants (Gantt 
& Tabone, 1998). The FEATS consists of 14 Likert Scales. Each of the scales assigns a 
numerical value between one and five to each of 14 formal art elements observable in 
drawings: the prominence of color, color fit, implied energy, space, integration, logic, 
realism, problem-solving, developmental level, details of objects and environment, line 
quality, person, rotation, and perseveration (Gantt & Tabone, 1998).  
Gantt and Tabone developed the FEATS manual in 1998 as an objective rating
system for art-based assessments including the PPAT in order to establish global
characteristics of art which provide information regarding diagnosis and clinical states. 
Inter-rater reliability gained through more than 10 years of study ranges from .90 to .94 
(Gantt & Tabone, 1998; Munley, 2002; Rockwell & Dunham, 2006; Williams et al., 
1996). Using an analysis of variance (ANOVA), Gantt (1998) and Munley (2002) 
identified validity of the scale that distinguished variances between two or more groups.  
Demographic questionnaire. This study utilized a questionnaire (Appendix C) 
to collect basic demographic information on gender, ethnicity, cultural background, age, 
and history of mental health diagnoses of each participant. Participants self-reported any 
known mental health diagnoses and indicated if they were currently using psychotropic 
medication. To collect and establish normative data, which was a purpose of this study,
the PPAT drawings of the participants who self-reported a DSM-IV-TR or DSM-5
diagnosis on the questionnaire were dropped from the analysis. Finally, participants
self-reported their current level of stress on a Likert scale in the questionnaire to identify
potential factors that might have influenced the results of this study.
Procedure 
IRB approval and informed consent. Approval from the Institutional Review
Board Committee of a mid-sized university in the Midwest of the United States was
obtained before beginning this study (Appendix D). Prior to the drawing task, I read an
informed consent agreement (Appendix E) to participants and asked them to sign it
to participate in the research. Each participant completed the
demographic questionnaire.
Administration. I utilized the standardized procedure developed by Gantt and 
Tabone (1998) to administer and rate the participants’ drawings in this study. I asked 
participants to draw a picture of “a person picking an apple from a tree” (Appendix A) on 
a piece of 18" x 12" white drawing paper using 12 Mr. Sketch scented watercolor markers 
(purple, pink, magenta, dark blue, light blue, dark green, light green, black, brown, 
yellow, orange, and red). No other instructions were provided.  
Raters and the rating procedure. Raters scored the drawings using the FEATS 
(Appendix B) rating sheets (Appendix F). The primary investigator did not rate or score 
the PPAT drawings to limit investigator bias. I originally planned to recruit one mixed 
rater group with a total of five American faculty and graduate students and one mixed 
rater group of five Asian faculty and graduate students from the departments of 
psychology and counseling, all of whom were blind to the hypotheses of the study. 
However, due to unavailability of the faculty members from the departments at this 
university, or their service on my committee, I recruited an American professional 
counselor and a Korean faculty member working in South Korea as alternative raters for 
each group of raters.
As a result, the American rater group consisted of four American graduate 
students from the Department of Psychology and one American professional counselor at 
my internship site. The Asian rater group consisted of four Asian graduate students from 
the Department of Psychology and one Asian faculty member from a Department of Art 
Therapy in South Korea. 
The primary researcher conducted a one-hour training session to ensure that
raters understood the rating system before they performed any ratings. All graduate
student raters from each group, four American graduate students and four Asian graduate
students, were trained and scored the PPAT drawings as a group on the same day at the
same time. The remaining rater from each group, the American professional counselor
and the Asian faculty member, was trained and scored the PPAT drawings during a
separate session at a separate time. All training sessions but one were conducted in person;
the training session with the Korean faculty member was conducted via Skype. The
average total rating time of the PPAT drawings for each rater was one hour and thirty minutes.
Because there were five people in each group of raters and two sets of 57 PPAT 
drawings from each cultural group to be rated, each rater was asked to rate two sets of 11 
or 12 of the PPAT drawings, one set from each cultural group. As a result, each rater 
rated 22 or 24 of the PPAT drawings, and each PPAT drawing was rated once by each 
group of raters, resulting in scores for each drawing from both an American and Asian 
rater. Therefore, for each drawing, two sets of 14 numeric variables were obtained, 
representing the scores for each of the 14 FEATS scales.  
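The workload arithmetic above can be checked with a short sketch. The thesis does not say how drawings were allotted to individual raters, so the round-robin split, the drawing IDs, and the function name `split_into_sets` are assumptions made only for illustration.

```python
def split_into_sets(drawing_ids, n_raters=5):
    """Deal drawings round-robin so that set sizes differ by at most one."""
    sets = [[] for _ in range(n_raters)]
    for i, drawing in enumerate(drawing_ids):
        sets[i % n_raters].append(drawing)
    return sets

# Hypothetical IDs for the 57 drawings from each cultural group
asian_drawings = [f"A{i:02d}" for i in range(1, 58)]
american_drawings = [f"U{i:02d}" for i in range(1, 58)]

asian_sets = split_into_sets(asian_drawings)
american_sets = split_into_sets(american_drawings)

# Each rater in a rater group receives one set from each cultural group
workload = [len(a) + len(u) for a, u in zip(asian_sets, american_sets)]
print([len(s) for s in asian_sets])  # [12, 12, 11, 11, 11]
print(workload)                      # [24, 24, 22, 22, 22]
```

Applying the same split once to the American rater group and once to the Asian rater group gives every drawing exactly one rating from each group, which is the pairing the inter-rater analysis relies on.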
Data Analysis 
 Data was analyzed using two different statistical tests and descriptive statistics to 
test each of the three research hypotheses: (a) reliability of the FEATS assessment in a 
cross-cultural context, (b) consistency between responses of the two groups in this study 
and those obtained by previous normative studies, and (c) consistency in the responses of
both groups to areas of the assessment instrument. To examine those hypotheses, three
primary assessment outcomes were analyzed: (a) inter-rater reliability, (b) characteristics
of formal elements in normative statistics, and (c) differences between the mean scores of
the two different cultural groups on each of the FEATS scales.
An independent t-test was used to determine if there were statistical differences 
between the responses of the two different cultural groups on the scales of the FEATS. 
Statistical significance was determined based on a .05 alpha level. In addition, normative 
statistics for the two groups were collected using minimum, maximum, mean, and 
standard deviation scores for each of the FEATS scales. Lastly, the Pearson correlation 
(Pearson’s r) was used to determine inter-rater reliability of the FEATS assessment in a 
cross-cultural setting. Inter-rater reliability helped indicate whether the FEATS 
instrument, used with the PPAT, was a reliable assessment tool in a cross-cultural setting. 
Inter-rater reliability was determined based on a .05 alpha level. 
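For readers who want to see the two tests concretely, the following is a minimal sketch using only the Python standard library. The function names and the sample scores are hypothetical. The pooled-variance form of the t-test is an assumption; it is chosen because its degrees of freedom (n1 + n2 - 2 = 112 for two groups of 57) agree with the df reported in the Results, but the thesis does not name the variant or the software used. P-values are omitted because they require the t distribution, which the standard library does not provide.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two raters' scores on the same drawings."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def independent_t(a, b):
    """Student's t for two independent groups with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((v - ma) ** 2 for v in a)
    ssb = sum((v - mb) ** 2 for v in b)
    df = na + nb - 2
    pooled_var = (ssa + ssb) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (ma - mb) / se, df

# Hypothetical scores on one FEATS scale, for illustration only
american_rater = [4, 3, 5, 4, 2, 5]
asian_rater = [4, 3, 4, 5, 2, 5]
print(pearson_r(american_rater, asian_rater))

asian_group = [4.5, 4.0, 5.0, 4.5, 4.0]
american_group = [4.0, 3.5, 4.5, 4.0, 4.5]
print(independent_t(asian_group, american_group))
```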
 
 
 
 
 
 
 
 
 
 
 
 
 
 
CHAPTER 3 
RESULTS 
A total of 114 PPAT drawings were collected with equal numbers from each 
cultural group: 57 PPAT drawings from the Asian sample group, and 57 PPAT drawings 
from the American sample group. Each PPAT drawing was rated once by each group of 
raters, resulting in scores for each drawing from both an American and Asian rater. For 
each drawing, two sets of 14 numeric variables were obtained, representing the scores for 
each of the 14 FEATS scales. The numeric data were analyzed using two statistical
tests, the independent t-test and the Pearson correlation (Pearson’s r), along with basic
descriptive statistics, to address each of the three research hypotheses.
Inter-Rater Reliability 
Once each group of raters had rated each of the 114 PPAT drawings, I utilized 
the Pearson correlation (Pearson’s r) to examine inter-rater reliability for the FEATS for 
the two cultural groups of raters. Table 2 presents the inter-rater reliability
correlations of the 14 individual FEATS scales in a cross-cultural setting. These data
indicate strong (p ≤ .01) to moderate (p ≤ .05) inter-rater reliability on all of the FEATS
scales except one, Rotation. Strong inter-rater reliability correlations were found for 11 of
14 FEATS scales, and two additional scales, Developmental Level and Line Quality,
achieved lower but still statistically significant correlations. These results were consistent
with previous research, which reported inter-rater reliability on the majority of individual
FEATS scales (Gantt & Tabone, 1998; Munley, 2002; Nan & Hinz, 2012; Rockwell &
Dunham, 2006). These results support the hypothesis that the FEATS
assessment instrument is reliable in a cross-cultural context.
Table 2
Inter-Rater Reliability Correlations for the FEATS

FEATS Scale                           Pearson Correlation r    Significance (2-tailed)
Prominence of Color                   .811***                  .000
Color Fit                             .652***                  .000
Implied Energy                        .859***                  .000
Space                                 .712***                  .000
Integration                           .491**                   .001
Logic                                 .484**                   .001
Realism                               .447**                   .002
Problem-Solving                       .398**                   .006
Developmental Level                   .325*                    .028
Details of Objects and Environment    .870***                  .000
Line Quality                          .336*                    .022
Person                                .553***                  .000
Rotation                              -.248                    .096
Perseveration                         .595***                  .000
Note. *p < .05. **p < .01. ***p < .001.
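The tally reported in the text (11 strong, 2 moderate, 1 non-significant) can be reproduced from the significance column of Table 2. The sketch below uses the tiers stated in the text (p ≤ .01 as strong, p ≤ .05 as moderate); the dictionary and function names are hypothetical, and the p-values are the rounded figures as printed in the table.

```python
from collections import Counter

# Two-tailed p-values as printed in Table 2 (rounded to three decimals)
p_values = {
    "Prominence of Color": 0.000, "Color Fit": 0.000, "Implied Energy": 0.000,
    "Space": 0.000, "Integration": 0.001, "Logic": 0.001, "Realism": 0.002,
    "Problem-Solving": 0.006, "Developmental Level": 0.028,
    "Details of Objects and Environment": 0.000, "Line Quality": 0.022,
    "Person": 0.000, "Rotation": 0.096, "Perseveration": 0.000,
}

def reliability_tier(p):
    """Tiers as described in the text: p <= .01 strong, p <= .05 moderate."""
    if p <= 0.01:
        return "strong"
    if p <= 0.05:
        return "moderate"
    return "not significant"

tally = Counter(reliability_tier(p) for p in p_values.values())
print(tally["strong"], tally["moderate"], tally["not significant"])  # 11 2 1
```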
 
 
 
 
 
 
 
 
 
Normative Data 
I collected normative statistics for the two cultural groups using minimum,
maximum, mean, and standard deviation scores for each of the FEATS scales. Table 3 
presents the normative statistics of each FEATS scale for the Asian and American sample 
groups. These statistical results represent the average of the two ratings per item, one 
from each rater scoring the drawing. 
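The averaging step explains why Table 3 contains half-point values such as 2.5 or 4.5: the mean of two whole-number ratings falls on a half-point grid. A minimal sketch of the computation follows. The rating pairs and the name `describe_scale` are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the thesis does not state which form was used.

```python
from statistics import mean, stdev

def describe_scale(rating_pairs):
    """Average the two ratings per drawing, then summarize one FEATS scale."""
    averaged = [(american + asian) / 2 for american, asian in rating_pairs]
    return {
        "min": min(averaged),
        "max": max(averaged),
        "mean": round(mean(averaged), 2),
        "sd": round(stdev(averaged), 2),  # sample SD, an assumption
    }

# Hypothetical (American rater, Asian rater) scores for four drawings
pairs = [(4, 5), (3, 4), (5, 5), (4, 4)]
summary = describe_scale(pairs)
print(summary)  # {'min': 3.5, 'max': 5.0, 'mean': 4.25, 'sd': 0.65}
```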
As Table 3 demonstrates, the mean scores for the Asian sample group ranged 
from a low score of 2.97 on Prominence of Color to a high score of 4.73 on Perseveration. 
For the American sample group, the mean scores ranged from a low score of 2.82 on 
Prominence of Color to a high score of 4.82 on Perseveration. In general, the majority of 
the mean scores on the 14 FEATS scales for each cultural group are consistent with the 
original researchers’ (Gantt & Tabone, 1998) prediction of formal elements for a 
normative sample. These results support the hypothesis that there is consistency between
normative statistics obtained in this research and the originators’ prediction about
non-patient drawings.
However, the mean score on the Developmental Level scale was inconsistent
with the originators’ prediction. Gantt and Tabone (1998) predicted non-client or
non-patient drawings would score at the adolescent developmental level, which is 4.0 or
higher on the Likert scale of 1 to 5. The mean scores for the Developmental Level scale were
3.58 for the Asian sample group, ranging from 2.0 to 4.5, and 3.72 for the American
sample group, ranging from 3.0 to 5.0. Therefore, mean scores for artistic development
found in both cultural groups were lower than the original prediction.
Table 3
Normative Statistics of the 14 FEATS Scales for Asian and American Groups

                                      Asian (N=57)               American (N=57)
Scale                                 Min   Max   Mean  SD       Min   Max   Mean  SD
Prominence of Color                   2.0   4.0   2.97  0.42     2.0   4.5   2.82  0.48
Color Fit                             3.0   5.0   4.18  0.69     3.0   5.0   4.20  0.60
Implied Energy                        2.0   5.0   3.67  0.92     2.0   4.0   3.47  0.58
Space                                 2.5   5.0   3.95  0.73     2.0   5.0   3.97  0.72
Integration                           3.5   5.0   4.45  0.45     3.0   5.0   4.08  0.56
Logic                                 3.0   5.0   4.39  0.52     3.0   5.0   4.21  0.46
Realism                               3.0   4.5   3.52  0.38     3.0   4.0   3.39  0.35
Problem-Solving                       3.0   5.0   4.32  0.47     3.0   5.0   4.21  0.46
Developmental Level                   2.0   4.5   3.58  0.50     3.0   5.0   3.72  0.44
Details of Objects & Environment      2.0   5.0   3.60  1.04     2.0   5.0   3.84  0.97
Line Quality                          3.0   5.0   4.02  0.56     2.5   5.0   4.21  0.62
Person                                2.0   5.0   4.08  0.71     3.0   5.0   4.29  0.50
Rotation                              1.5   5.0   4.65  0.70     3.5   5.0   4.76  0.30
Perseveration                         3.5   5.0   4.73  0.47     3.5   5.0   4.82  0.36

Table 3 demonstrates similarity between the mean scores on each of the FEATS
scales for the Asian sample group and the American sample group. In addition, the
majority of the mean scores on each of the FEATS scales from the two cultural groups 
fell in line with normative statistics gathered in previous research (Bucciarelli, 2011; Nan 
& Hinz, 2012; Oh, 2013). In Bucciarelli’s (2011) and Nan and Hinz’s (2012) normative 
studies, the Prominence of Color scale was the lowest rated variable (M = 3.14 in 
Bucciarelli’s (2011) study; M = 2.68 in Nan and Hinz’s (2012) study). The Perseveration 
and Rotation scales were the highest rated variables in their studies (Bucciarelli, 2011; 
Nan & Hinz, 2012). For both cultural groups, the Prominence of Color scale was also the 
lowest rated variable and the Perseveration and Rotation scales were also the highest 
rated variables. These results support the hypothesis that there is consistency between the 
normative statistics obtained in this research and those from previous research. 
Independent t-test 
There were 114 mean scores for the Asian and American sample groups on each 
of the 14 FEATS scales, with equal numbers from each sample group: 57 mean scores for 
the Asian sample group and 57 mean scores for the American sample group on each of 
the scales. The dependent variable was the mean scores and the independent variable was 
the participants’ cultural background. I analyzed the data using an independent t-test to 
determine if there were statistical differences between the mean scores of the Asian 
sample group and the American sample group on each of the FEATS scales. 
Table 4 presents t-test results for the differences between the mean
scores of the Asian and American sample groups on each of the FEATS scales. As Table
4 shows, no statistically significant difference was found in the mean scores between the
Asian sample group and the American sample group on any of the FEATS scales at the
.05 significance level, except for the Integration scale. There was a significant
difference (t(112) = 2.42, p = .020) in the mean score of the Integration scale between the
drawings of the Asian sample group (M = 4.45, SD = .45) and the American sample group
(M = 4.08, SD = .56). In addition, the study revealed some difference on the Prominence
of Color scale (t(112) = 1.85, p = .067), the Logic scale (t(112) = 1.89, p = .061), the
Realism scale (t(112) = 1.80, p = .074), and the Person scale (t(112) = -1.74, p = .085),
but these differences did not reach significance at the .05 level. These results supported
the hypothesis that there is no difference in the mean scores of the two cultural groups,
Americans and Asians, on the majority of FEATS scales.

Table 4
t-test Results of the 14 FEATS Scales by the Asian and American Sample Groups

FEATS Scale                           t         df     p        r
Prominence of Color                   1.851     112    .067     .172
Color Fit                             -.072     112    .943     .006
Implied Energy                        .410      112    .682     .038
Space                                 .251      112    .803     .023
Integration                           2.422     112    .020*    .223
Logic                                 1.894     112    .061     .176
Realism                               1.800     112    .074     .167
Problem-Solving                       1.282     112    .202     .120
Developmental Level                   -1.567    112    .120     .146
Details of Objects & Environment      -1.158    112    .249     .108
Line Quality                          -1.519    112    .132     .140
Person                                -1.738    112    .085     .162
Rotation                              -1.033    112    .304     .097
Perseveration                         -1.337    112    .184     .125
Note. *p < .05. **p < .01.
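The r column in Table 4 is not defined in the text. Its values appear consistent with the common conversion of a t statistic to an effect size, r = sqrt(t² / (t² + df)); this reading is an assumption rather than something the thesis states. A short sketch, with a hypothetical function name, shows the arithmetic for two rows of the table.

```python
import math

def effect_size_r(t, df):
    """Effect size r from a t statistic (assumed conversion, see lead-in)."""
    return math.sqrt(t * t / (t * t + df))

print(round(effect_size_r(2.422, 112), 3))  # 0.223, the Integration row
print(round(effect_size_r(1.851, 112), 3))  # 0.172, the Prominence of Color row
```

Under this reading, r is always non-negative, which would explain why scales with negative t values still show positive r in the table.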
 
 
 
 
 
 
 
 
 
CHAPTER 4 
DISCUSSION 
 In an increasingly multicultural society, there have been serious concerns that 
many psychological assessments derive from a set of cultural assumptions and 
constructions that are uniquely Euro-American in origin and serve as possible agents of 
cultural imperialism to marginalize and stigmatize minority groups with flagrant labels of 
mental illness (Johnson, 2001; Reynolds & Suzuki, 2012). In particular, art-based
assessments have become a center of controversy due to their unique nature; although
art-based assessments have been considered less culturally bound (Hocoy, 2002; Williams,
French, Picthall-French, and Flagg-Williams, 2011) due to an assumption of universality
of art as a form of communication and expression (Dissanayeke, 1995; Robin, 1999), there
has been a lack of research and empirical evidence to support their cross-cultural
reliability (Betts, 2013; Feder & Feder, 1998; Hocoy, 2002). The lack of empirical
evidence to support the assumptions of art-based assessments causes controversy 
regarding their cultural reliability. 
 The purpose of this study was to contribute to the literature on the cross-cultural 
use of art-based assessment by looking at one art therapy assessment in particular, the 
Formal Elements Art Therapy Scale (FEATS) used with Person Picking an Apple from a 
Tree (PPAT) instrument. The research was designed to identify whether the FEATS 
instrument in coordination with the PPAT (Gantt & Tabone, 1998) would be a reliable art 
therapy assessment in a cross-cultural context, and to establish empirical support for the 
development of normative data on the cross-cultural application of the FEATS scales for 
two different cultural groups: American and Asian college students. The outcomes of this
study supported the cross-cultural use and reliability of the FEATS with PPAT as an art
therapy assessment tool. The results should promote a more comprehensive 
understanding of cross-cultural applications of the FEATS. 
Hypothesis One 
 The primary researcher hypothesized that there would be cross-cultural reliability 
of the assessment instrument, the FEATS scale, between groups of Asian raters and 
American raters. This research supports the hypothesis by demonstrating strong (p ≤ .01) 
to moderate inter-rater reliability (p ≤ .05) on 13 of the 14 FEATS scales. In each group 
of raters, four members were graduate students from a department of psychology and the 
FEATS assessment was unknown to them prior to their participation in this study. The 
American professional rater knew very little about art therapy and art therapy assessment, 
whereas the Asian professional rater was from a department of art therapy in South Korea 
and was very familiar with the FEATS. The high rates of inter-rater reliability found in 
this study suggest the FEATS is a reliable assessment in a cross-cultural context, and it 
has potential to be adopted by professionals with various training backgrounds. 
With only one exception, the results presented statistically significant interrater
reliability for the 14 individual FEATS scales in a cross-cultural context. The Rotation
scale, however, did not show significant interrater reliability. The original pilot studies of
the FEATS (Gantt & Tabone, 1998), Nan and Hinz’s study (2012), and Oh’s pilot study
(2013) also demonstrated no significant interrater reliability on the Rotation scale,
suggesting that it is a difficult scale to rate. Figure 1 is an example of the tilted human
figure or tree depicted in the participants’ drawings.

Figure 1. Example Showing the Rating Problem for the Rotation Scale

According to the FEATS manual (Gantt & Tabone, 1998), a tilted figure or tree 
should be rated between 3 and 5 on the Rotation scale, but a few of the graduate student 
raters, in particular Asian graduate student raters, rated these images between 1 and 2 
because they judged that the tilted figures and trees were depicted with logical reasons for 
the deviation, such as lengthening the body. Therefore, ratings between the two groups of 
raters for some drawings demonstrated extremes, which significantly lowered the 
interrater reliability on the Rotation scale.   
Hypothesis Two 
The researcher hypothesized that normative statistics obtained for the two 
cultural groups in this study would be consistent with the originators’ (Gantt & Tabone, 
1998) predictions about non-patient drawings and with normative statistics obtained in 
previous research (Bucciarelli, 2011; Nan and Hinz, 2012). This study supported this 
hypothesis by demonstrating normative statistics for the two cultural groups consistent 
with the predictions. According to the originators (Gantt & Tabone, 1998), normative,
baseline assessment data would derive from drawings that:
have reasonably upright figures, have colors appropriate to the subject matter,
depict a fairly realistic person, have an integrated composition, have a good line
quality, have control over lines and elements, have the reasonable problem-solving
strategy, would be logical, and show at least the developmental features
common to adolescent drawings (p. 55).
Gantt and Tabone (1998) predicted non-client or non-patient drawings would
score, on average, 4.0 or higher on the Likert scales of 1 to 5 for most of the above
assumptions. As Tables 2 and 3 demonstrated, statistical normative results for both
cultural groups confirmed nearly all of the above assumptions about non-patient drawings.
In both cultural groups, seven scales of the FEATS corresponded with the predictions of a 
score above 4.0 (refer to Tables 2 and 3), including (a) appropriate color use (The Color 
Fit scale); (b) well-integrated composition (The Integration Scale); (c) reasonably upright 
figures (The Rotation Scale); (d) realistic and reasonable depiction of a person (The 
Person Scale); (e) logical elements (The Logic Scale); (f) the reasonable problem-solving 
strategy (The Problem-Solving Scale); and (g) control over lines and elements in 
drawings (The Perseveration Scale). 
With only one exception, the results demonstrated consistency between the 
normative statistics obtained in this study and the originators’ predictions. Gantt and 
Tabone (1998) predicted that non-patient drawings would 
score, on average, 4.0 or higher on the Developmental Level Scale. In this study, the 
mean scores on the Developmental Level scale for the Asian sample group (M = 3.58) 
and the American sample group (M = 3.72) did not correspond with the originators’ 
prediction. This difference may be due to a notable degree of playfulness and creativity 
expressed in the drawings for this study. Many drawings in this study included playful 
and creative characteristics, such as arms drawn as extending from the head or neck of 
the person, X-ray body parts, flowing or flying figures, or unrealistically sized persons (refer 
to Figures 2 and 3). These characteristics are considered to correspond to the latency-age or 
child stage of artistic development (Gantt & Tabone, 1998; Lowenfeld & Brittain, 1987). 
As a result, many drawings with these characteristics were rated between latency-age and 
adolescent stages of artistic development, which is lower than a score of 4.0 on the 
Developmental Level scale.  
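The comparison against the originators’ predicted minimum can be sketched as a simple threshold check. The minimal Python sketch below uses only the four group means quoted in this chapter; the remaining scale means appear in Tables 2 and 3 and are omitted here. The function name and data layout are illustrative assumptions and are not part of the FEATS manual.

```python
# Predicted non-patient average from Gantt and Tabone (1998): 4.0 or higher.
PREDICTED_MINIMUM = 4.0

# Only the group means quoted in this chapter; other scales are omitted.
reported_means = {
    "Developmental Level": {"Asian": 3.58, "American": 3.72},
    "Integration": {"Asian": 4.45, "American": 4.08},
}


def scales_below_prediction(means, minimum=PREDICTED_MINIMUM):
    """Return the scales on which any group's mean falls below the minimum."""
    return [
        scale
        for scale, groups in means.items()
        if any(group_mean < minimum for group_mean in groups.values())
    ]


print(scales_below_prediction(reported_means))  # prints ['Developmental Level']
```

With the full set of means from Tables 2 and 3 entered in the same layout, the same check would list every scale that departs from the prediction.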
Figure 2. PPAT Drawing Demonstrating an Unrealistic Person’s Hand Size  
 
Figure 3. Example of Creativity and Playfulness in a PPAT Drawing 

Participants from both cultural groups did not receive feedback about their 
images, nor were they clinically assessed on the basis of their drawings. Therefore, 
participants might have had less concern for what other people thought about their 
drawings and thus might have been inclined to playfully and creatively express 
themselves. 
In addition, the mean score on the Developmental Level scale may have been 
lower than the originators’ prediction because of the raters’ level of art therapy 
training or experience. In this research, all raters except the Korean faculty from the 
department of art therapy may have lacked exposure to graphic indicators of 
development because of their academic backgrounds. According to Gantt and Tabone 
(1998), raters with art training may rate drawings on the Developmental Level scale 
more accurately than raters without art training. Theoretically, raters with art training may 
have a better understanding of the stages of artistic development and would consider the 
characteristics of each drawing as a whole to score the Developmental Level scale. In this study, 
there were several drawings that could have been rated higher on the Developmental 
Level; those drawings included many characteristics of an adolescent drawing level, such 
as a relaxed schema of person and depth perception, but with one or two qualities of a 
latency-age developmental level, such as discontinuous lines. However, most raters 
judged that these drawings were at latency-age developmental level, and therefore scored 
these drawings lower than 4.0 because they only paid attention to characteristics 
representing a latency-age developmental level. This result suggests that the accuracy of 
the Developmental Level scale may be influenced by the level of the raters’ art therapy or 
art education training. 
This study also supports the hypothesis that the majority of the mean scores on 
each of the FEATS scales from the two cultural groups would fall in line with recent 
normative statistics gathered in previous research (Bucciarelli, 2011; Nan & Hinz, 2012; 
Oh, 2013). However, despite these similarities, the overall variability and standard 
deviations of the 14 FEATS scales for the Asian sample in Nan and Hinz’s (2012) study 
were higher than those of both the Asian and American sample groups in this study. This 
difference in variability between Nan and Hinz’s (2012) study 
and this research may be due to a fatigue effect that raters in this study may have 
experienced. In Nan and Hinz’s (2012) study, raters had 51 PPAT drawings to rate, but 
with a total of eight raters, each rater was asked to rate only 12 or 13 of the PPAT 
drawings. In this study, by contrast, each rater was asked to rate 22 or 24 of the PPAT 
drawings, roughly twice the number rated by each rater in Nan and Hinz’s 
study. Therefore, the raters in Nan and Hinz’s study may have been less fatigued than the 
raters in this study, thereby maintaining their concentration throughout the rating session; 
thus, they may have found subtle differences more accurately in each of the drawings and 
given a larger range of scores to each of the FEATS scales. This would increase the 
overall variability and standard deviation, whereas the raters in this study may have been 
less sensitive to small differences in each drawing and used a smaller range of scores on 
each of the FEATS scales, which lowered the overall standard deviation with little 
variability. 
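The link between the range of scores a rater uses and the resulting standard deviation can be illustrated with a short sketch. The two rating lists below are hypothetical and are not data from this study or from Nan and Hinz (2012); they share the same mean so that only the spread differs.

```python
from statistics import mean, stdev

# Hypothetical ratings of eight drawings on one FEATS scale (illustration only).
wide_range = [2.0, 3.0, 4.0, 4.5, 5.0, 3.5, 5.0, 5.0]    # rater uses many score levels
narrow_range = [3.5, 4.0, 4.0, 4.0, 4.5, 4.0, 4.0, 4.0]  # rater clusters around one score


def summarize(ratings):
    """Return the mean and sample standard deviation of a list of ratings."""
    return mean(ratings), stdev(ratings)


wide_mean, wide_sd = summarize(wide_range)
narrow_mean, narrow_sd = summarize(narrow_range)

# Both lists average 4.0, but the clustered ratings yield the smaller deviation.
print(wide_mean, narrow_mean)
print(wide_sd > narrow_sd)  # prints True
```

The sketch shows only the arithmetic relationship; it does not establish that fatigue caused the narrower range of scores in this study.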
Hypothesis Three 
The researcher hypothesized that there would be no difference in the scores for 
the two cultural groups on the majority of FEATS scales. This study supported this 
hypothesis by demonstrating no statistically significant difference between the mean 
scores of the Asian sample group and the American sample group on 13 of the 14 
FEATS scales. In the original pilot study of the FEATS, Gantt and Tabone (1998) 
hypothesized that the FEATS assessment used with the PPAT drawing had great potential 
for cross-cultural reliability and utility, because the assessment focuses on how people 
draw rather than what they draw. The high rate of statistical consistency found 
in this study empirically supports the originators’ assumption about the cross-cultural reliability 
and usefulness of the assessment, and suggests that the FEATS with the PPAT drawing is 
a reliable and useful assessment in a cross-cultural context. 
However, despite the consistency between the scores of the Asian and the 
American sample groups on the majority of the FEATS scales, there was a significant 
difference found on one scale, Integration (t(112) = 2.42, p = .020). The mean score of 
the Asian sample group (M = 4.45) on the scale was higher than that of the American 
sample group (M = 4.08). The majority of drawings from the Asian sample group 
included more than one person and these people had a close visual relationship to each 
other and with other elements, such as trees or houses, in the drawing as presented in 
Figure 4. According to the FEATS (Gantt & Tabone, 1998), drawings that include 
well-balanced relationships between three or more elements, rather than just a person and 
a tree, should receive a rating between 4 and 5 on the Integration Scale. 
Therefore, many drawings from the Asian sample group were rated 4 or 5 on the 
Integration Scale, which contributed to the significant difference from the mean score of the 
American sample group.  
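The 112 degrees of freedom reported above are consistent with a pooled-variance independent-samples t test on two groups of 57 participants (57 + 57 - 2 = 112). The sketch below shows how such a statistic is computed; it assumes the pooled-variance form of the test, and the ratings in it are hypothetical rather than data from this study.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Return (t, df) for a pooled-variance independent-samples t test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Hypothetical Integration ratings for two small groups (illustration only).
asian_ratings = [5, 4, 5, 4, 5, 4]
american_ratings = [4, 4, 3, 5, 4, 4]
t_stat, df = pooled_t(asian_ratings, american_ratings)
print(t_stat, df)
```

A positive t here means the first group's mean is higher, matching the direction of the reported difference; the p value would then be read from the t distribution with the returned degrees of freedom.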
 
Figure 4. Example of an Asian Participant’s Drawing Demonstrating a Close Visual 
Relationship between Persons, Tree, and House 

This difference on the Integration Scale may be due to a language barrier for the 
Asian participants. The directive for this research was provided in English for both 
cultural groups: “Draw a picture of a person picking an apple from a tree.” English was 
not the native language of any of the Asian participants; thus, the Asian participants 
may have interpreted the directive less literally, disregarding the articles “a” or 
“an” that preceded “person” or “tree” in the directive. Therefore, the Asian participants 
may have been inclined to draw three or more elements, spontaneously generating a 
well-integrated composition. 
In addition, the significant difference on the Integration Scale may indicate a 
difference in cultural worldviews between the Asian and American sample groups. The 
Integration Scale measures the degree to which the items in the picture are balanced 
into a cohesive whole. In other words, the scale indicates the degree to which individuals 
focus on “relationships” among figures and items in their drawings. In general, Eastern 
societies are known for stressing a collectivistic orientation that considers the world as a 
massive field composed of complicated relationships among subjects and objects (Miike, 
2007; Selin, 2003). Western societies, by contrast, are considered to be based on the 
philosophy of individualism, a worldview that places the central focus on each 
separate figure and object in the world (Miike, 2007; Montet, 1989; Selin, 2003). In this 
study, several American participants presented a focus on each separate element in their 
drawings by illustrating a limited relationship between elements. Figure 5 is an example 
of an American participant’s drawing demonstrating a focus on separate elements rather 
than a relationship between the two; although the person and tree were close to each other, 
the person was standing on the ground with an arm extended but staring at the viewer, not 
at the apple in the tree. The difference between the two cultural worldviews 
may have contributed to the significant difference on the Integration Scale between the Asian 
and American sample groups. 

Figure 5. Example of an American Participant’s Drawing Demonstrating a Focus on 
Separate Elements rather than a Relationship between the Person and Tree 

Challenges 
The primary challenge in executing the research design was gathering a pool of 
American participants as the Psychology Department pool was unavailable for this study. 
Another challenge was the recruitment of an American faculty member as a rater. Since 
the faculty members from the department of counseling and psychology were unavailable 
due to their busy schedules, or their service on my thesis committee and familiarity with 
the research questions and hypotheses, I had to look for an alternative rater as an 
equivalent to a faculty member. As a result, a professional counselor was chosen to 
participate as a rater. Because researchers may always confront 
unexpected challenges in recruiting participants and collecting data, I hope that the challenges in 
my research can serve as a reference for future researchers preparing for unexpected 
events in data collection procedures. 
Limitations    
Despite measures to control the research outcomes and reduce biases, there were 
several limitations in this research. Participants from each cultural group were selected 
from a mid-sized university in the midwestern United States via convenience sampling 
of undergraduate students for feasibility; this limited generalizability and 
reliability. The relatively small sample size may also limit the reliability and generalizability of 
this study. Even though the sample size in this study, a total of 114 participants, was 
larger than the sample sizes of the previous normative studies (Bucciarelli, 2011; Nan & 
Hinz, 2012; Oh, 2013), it is likely still too small to use as norm groups for the two 
cultural groups or to generalize the results.  
In addition, this study may not be completely representative of Western and 
Eastern cultures. American undergraduate students do not represent all Western cultures 
and values, and Asian participants consisting of Chinese, Japanese, and Korean students 
may not represent all Asian cultures and values. Future studies may need to include 
more diverse ethnic groups in Western and Asian samples. Finally, participants were 
asked to self-report any mental health disorders on the demographic questionnaire; 
however, the sample may have inadvertently included drawings from participants with 
unreported or undiagnosed symptoms, which may reduce the reliability of this study. 
Future Implications and Conclusion 
 This study provided preliminary normative data to empirically support the cross-cultural 
reliability and utility of the FEATS with PPAT drawings as an art therapy 
assessment tool. Further research is needed to verify and strengthen the results. To 
replicate this study with reliable results, future studies need to include random samples 
from a variety of geographic locations, with participants of diverse ages and 
socioeconomic backgrounds. Although this study supported the utility and reliability of 
the FEATS between two distinct cultural groups, Asian and American, additional studies 
of different cultural groups with larger sample sizes will be essential for establishing 
reliable normative data to indicate the cross-cultural reliability of the FEATS and PPAT. 
In addition, a few raters in this study reported that they were confused about the criteria 
used to rate the Rotation Scale. Therefore, it would be valuable for further research to test 
whether interrater reliability on the Rotation Scale improves when raters are trained 
with an objective tool, such as a diagram correlating the degree of angles relative to the 
vertical axis with particular scores. Finally, it would improve the rigor of future 
normative studies if researchers required all raters to rate drawings using practices 
that reduce the chance of fatigue. 
 This research supported the development of larger normative studies for cross-cultural 
use of the FEATS with PPAT drawings. The findings suggest that the FEATS is a 
reliable and useful art therapy assessment in a cross-cultural context, and that it has the 
potential to be adopted by professionals with various cultural and training backgrounds. 
The establishment of a normative baseline not only promotes a more comprehensive understanding 
of cross-cultural applications of the FEATS but also increases the value of the FEATS 
with PPAT assessment in clinical work and in research. It is my hope that this research 
will inspire art therapists and clinicians in many parts of the world to contribute to the 
growing body of normative data on the Formal Elements Art Therapy Scale and Person 
Picking an Apple from a Tree assessment. 
 
 
 
 
 
 
 
 
 
REFERENCES 
Acton, D. (2001). The “color blind” therapist. Art Therapy: Journal of the American Art  
Therapy Association, 18(2), 109-112. 
American Art Therapy Association (2015). About art therapy. Retrieved from   
www.arttherapy.org. 
American Psychiatric Association (2000). Diagnostic and statistical manual of mental  
disorders (4th ed., text rev.). Washington, DC: Author. 
American Psychiatric Association (2013). Diagnostic and statistical manual of mental  
disorders (5th ed.). Washington, DC: Author. 
Amos, S. (1982). The diagnostic, prognostic, and therapeutic implications of  
schizophrenic art. The Arts in Psychotherapy, 9, 131-143. 
Arieti, S. (1976). Creativity: The magic synthesis. New York, NY: Basic Books. 
Arrington, D., & Yorgin, P. D. (2001). Art therapy as a cross cultural means to assess  
psychosocial health in homeless and orphaned children in Kiev. Art Therapy: 
Journal of the American Art Therapy Association, 18(2), 80–88. 
Arrington, D.B. (2005). Global art therapy training- Now and before. The Arts in  
Psychotherapy, 32(3), 193-203. 
Benhabib, S. (2012). The Claims of Culture: Equality and diversity in the global era.  
Princeton, NJ: Princeton University Press. 
Betts, D. (2013). A review of the principles for culturally appropriate art therapy  
assessment tools. Art Therapy: Journal of the American Art Therapy Association, 
30(3), 98-106. 
Bucciarelli, A. (2011). A normative study of the Person Picking an Apple from a Tree  
(PPAT) assessment. Art Therapy: Journal of the American Art Therapy 
Association, 28(1), 31-36. 
Calisch, A. (2003). Multicultural training in art therapy: Past, present, and future. Art  
Therapy: Journal of the American Art Therapy Association, 20(1), 11-15. 
Cheryl, D.C. (2006). Cultural diversity curriculum design: An art therapist's perspective.  
Art Therapy: Journal of the American Art Therapy Association, 23(4), 172-180. 
Cohen, J. (1998). Statistical power analysis for the behavioral sciences. London:  
Lawrence Erlbaum. 
Cuneo, K., & Welsh, M. (1992). Perception in young children: Developmental and  
neuropsychological perspectives. Child Study Journal, 22, 73-92. 
Cruz, R.F. (2005). Introduction to special issue: The international scope of arts therapists.  
The Arts in Psychotherapy, 32, 167-169. 
Debra, L.K., Siu, M.C., & Potash, J.S. (2012). Art Therapy in Asia: To the bone or  
wrapped in silk. London: Jessica Kingsley Publishers. 
Dissanayake, E. (1995). Homo Aestheticus: Where art comes from and why. Seattle, WA:  
University of Washington Press. 
Feder, B., & Feder, E. (1998). The art and science of evaluation in the arts therapies:  
How do you know what is working? Springfield, IL: Charles C Thomas. 
Evans, C. (1984). “Draw a Person…a Whole Person:” Drawings from psychiatric  
patients and well-adjusted adults as judged by six traditional DAP indicators, 
licensed psychologists and the general public. Temple University, Philadelphia, 
PA. 
Gantt, L., & Tabone, C. (1998). The Formal Elements Art Therapy Scale: A rating  
manual. Morgantown, WV: Gargoyle Press. 
Gantt, L. (1990). A validity study of the Formal Elements Art Therapy Scale (FEATS) for  
diagnostic information in patients’ drawings. Pittsburgh, PA: University of  
Pittsburgh. 
Gantt, L. (2001). The Formal Elements Art Therapy Scale: A measurement system for  
global variables in art. Art Therapy: Journal of the American Art Therapy 
Association, 18(1), 50–55. 
Gantt, L. (2004). The Case for Formal Art Therapy Assessments. Art Therapy: Journal of  
the American Art Therapy Association, 21(1), 18-29. 
Groth-Marnat, G. (1999). Current status and future directions of psychological  
assessment: Introduction. Journal of Clinical Psychology, 55(7), 781-785. 
Goodwin, C. J. (2007). Research in psychology: Methods and design (5th ed.). New York,  
NY: John Wiley & Sons. 
Gussak, D. (2004). Art therapy with prison inmates: A pilot study. The Arts in  
Psychotherapy, 31(4), 245–259. 
Gussak, D. (2006). Effects of art therapy with prison inmates: A follow-up study. The  
Arts in Psychotherapy, 33(3), 188–198. 
Gussak, D. (2007). The effectiveness of art therapy in reducing depression in prison  
populations. International Journal of Offender Therapy and Comparative 
Criminology, 51(4), 444–460. 
Gutmann, A. (2003). Identity in Democracy. Princeton, NJ: Princeton University Press. 
Hocoy, D. (2002). Cross-cultural issues in art therapy. Art Therapy: Journal of the  
American Art Therapy Association, 19(4), 141-145. 
Hollinger, D. (1995). Postethnic America: Beyond multiculturalism. New York, NY:  
Basic Books. 
Johnson, Z. (2001). Cultural competency and humanistic psychology. Humanistic  
Psychologist, 29, 204-222. 
Jung, J. S., & Kim, G. S. (2010). A study on the responsive characteristic of the FKD by  
the depression level of university students [Korean with English summary]. 
Korean Journal of Art Therapy, 17(3, SN. 48), 633-647. 
Kaiser, D. H., & Deaver, S. P. (2009). Assessing attachment with the Bird’s Nest  
Drawing: A review of the research. Art Therapy: Journal of the American Art 
Therapy Association, 26(1), 26–33.  
Kaplan, F.F. (2003). The paradox of multiculturalism. Art Therapy: Journal of the  
American Art Therapy Association, 20(1), 2-2. 
Kymlicka, W. (1989). Liberalism, Community, and Culture. Oxford: Oxford University  
Press. 
Lowenfeld, V. (1939). The nature of creative activity. New York, NY: Harcourt Brace. 
Lowenfeld, V. (1947). Creative and mental growth. New York, NY: Macmillan. 
Lilienfeld, S. O. (1999). Projective measures of personality and psychopathology: How  
well do they work? Skeptical Inquirer, 23(5), 32-39. 
Malchiodi, C. A. (1998). The art therapy sourcebook: Art making for personal growth,  
insight and transformation. New York, NY: McGraw-Hill. 
McNiff, S. (1984). Cross-cultural psychotherapy and art. Art Therapy: Journal of the  
American Art Therapy Association, 1(3), 125–131. 
Miike, Y. (2007). An Asiacentric reflection on Eurocentric bias in communication theory.  
Communication Monographs, 74(2), 272-278. 
Montet, M. P. (1989). Europe’s spiritual origins. International Management, 44, 38-39. 
Moon, C.H. (2010). Materials & media in art therapy: Critical understandings of diverse  
artistic vocabularies. New York, NY: Routledge. 
Munley, M. (2002). Comparing the PPAT drawings of boys with AD/HD and age- 
matched controls using the Formal Elements Art Therapy Scale. Art Therapy: 
Journal of the American Art Therapy Association, 19(2), 69-76. 
Nan, J. KM., & Hinz, L. D. (2012). Applying the Formal Elements Art Therapy Scale  
(FEATS) to adults in an Asian population. Art Therapy: Journal of the American  
Art Therapy Association, 29(3), 127-132. 
Oh, S.B. (2013). A normative study for the cross-cultural use of the Person Picking an  
Apple from a Tree (PPAT) and the Formal Elements Art Therapy Scale (FEATS). 
Unpublished pilot study. Emporia State University, Kansas.  
Reynolds, C.R., & Suzuki, L.A. (2012). Bias in psychological assessment: An empirical  
review and recommendations. In Weiner, I.B., Graham, J.R., & Naglieri, J.A. 
(Eds.), Handbook of Psychology, Volume 10: Assessment Psychology (pp.82-113). 
New York, NY: John Wiley & Sons. 
Robertson, J. (1952). The use of colour in the paintings of psychotics. British Journal of  
Psychiatry, 98(410), 174-184. 
Rockwell, P., & Dunham, M. (2006). The utility of the Formal Elements Art Therapy  
Scale in assessment for substance use disorder. Art Therapy: Journal of the 
American Art Therapy Association, 23(3), 104-111. 
Rubin, J. A. (1999). Art therapy: An introduction. Lillington, NC: Edward Brothers. 
Russell-Lacy, S., Robinson, V., Benson, J., & Cranage, J. (1979). An experimental study  
of pictures produced by acute schizophrenic subjects. British Journal of 
Psychiatry, 134, 195-200. 
Selin, H. (2003). Nature across cultures: Views of nature and the environment in non- 
western cultures. New York, NY: Springer Publishing. 
Skovholt, T. M., & Rivers, D. A. (2007). Helping skills and strategies. Denver, CO: Love  
Publishing. 
Song, S. (2007). Justice, gender, and the politics of multiculturalism. Cambridge, MA:  
Cambridge University Press. 
Stoll, B. (2005). Growing pains: The international development of art therapy. The Arts in  
Psychotherapy, 32, 171-191. 
Silver, R. (2003). Cultural differences and similarities in responses to the Silver Drawing  
Test in the USA, Brazil, Russia, Estonia, Thailand, and Australia. Art Therapy:  
Journal of the American Art Therapy Association, 20(1), 16-20. 
Sue, D. W., & Sue, D. (1999). Counseling the culturally different: Theory and practice  
(2nd ed.). New York, NY: John Wiley & Sons. 
Taylor, C. (1992). The politics of recognition in multiculturalism: Examining the politics  
of recognition. Princeton: Princeton University Press. 
University of California Los Angeles Academic Technology Services. (n.d.). SPSS FAQ.  
Retrieved from http://www.ats.ucla.edu/stat/spss/faq/alpha.html 
Vernier, C., Stafford, J., & Krugman, A. (1958). A factor analysis of indices from four  
projective techniques associated with four different types of physical pathology. 
Journal of Consulting Psychology, 22, 433-439. 
Young, I. M. (1990). Justice and the Politics of Difference. Princeton, NJ: Princeton  
University Press. 
Wadeson, H. (1980). Art Psychotherapy. New York, NY: John Wiley & Sons. 
Wilkinson, A., & Schnadt, F. (1968). Human figure drawing characteristics: An empirical  
study. Journal of Clinical Psychology, 24, 224-226. 
Williams, R. B., French, L. A., Picthall-French, N., & Flagg Williams, J. B. (2011). In  
pursuit of the Aboriginal child’s perspective via a culture-free task and clinical 
interview. SIS Journal of Projective Psychology & Mental Health, 18(1), 22–27. 
Wolf Bordonaro, G.P. (2015). International art therapy. In Gussak, D.E., & Rosal, M.L.  
(Eds.), The Wiley-Blackwell Handbook of Art Therapy. New York, NY: Wiley-Blackwell. 
White, C., Wallace, J., & Huffman, L. (2004). Use of drawings to identify thought  
impairment among students with emotional and behavioral disorders: An  
exploratory study. Art Therapy: Journal of the American Art Therapy Association, 
21(4), 210–218. 
  
Appendix A 
Person Picking an Apple from a Tree (PPAT) 

Appendix B 
Formal Elements Art Therapy Scales 

Appendix C 
Demographic Questionnaire 

Age ______________   Gender ______________  
Cultural Background______________ (i.e: American, Hispanic, Asian, European.) 
Country of birth__________________ (i.e: USA, China, France, and so on.) 
Do you have any major medical conditions? If yes, please list 
________________________________________________________________________
 ________________________________________________________________________ 
Are you currently in any kind of mental health counseling or therapy? If yes, please 
describe 
________________________________________________________________________
 ________________________________________________________________________ 
Are you currently taking any psychotropic medications? _______________________ 
Where do you find yourself on the scale, from 1 to 5, regarding level of stress? 
__________________ (1= Not at all stressed, 2= Not very stressed, 3= Neutral, 4= 
Somewhat stressed 5= Very stressed) Please place a number on line above. 
If you score above 4, please describe the stressors that make you feel stressed 
 
 
  
Appendix D 
Emporia State University IRB Letter of Approval 

Appendix E 
Informed Consent 

INFORMED CONSENT DOCUMENT 
The Department of Counselor Education at Emporia State University supports the 
practice of protection for human subjects participating in research and related activities. 
The following information is provided so that you can decide whether you wish to 
participate in the present study. You should be aware that even if you agree to participate, 
you are free to withdraw at any time, and that if you do withdraw from the study, you will 
not be subjected to reprimand or any other form of reproach. Likewise, if you choose not 
to participate, you will not be subjected to reprimand or any other form of reproach. 
 
This study will examine the Formal Element Art Therapy Scales (FEATS) through the 
use of the Person Picking an Apple from a Tree (PPAT) art directive. Study participation 
will take approximately 20 minutes. As a participant in this study you will participate in 
drawing a picture of a person picking an apple from a tree and completing a questionnaire.  
  
There are minimal known risks associated with participation. The purpose of this study is 
to establish normative data to support cross-cultural use of one art-based assessment by 
empirically examining its cross-cultural utility. As a result this research should provide 
the benefit to establish normative statistics to support for cross-cultural utility of the 
FEATS with the PPAT drawing and improve understanding of differences in the 
assessment in respect with cultural backgrounds. 
 
All completed study materials will be kept in a locked cabinet in the Earl Center on the 
Emporia State University Campus. Identifying information, such as name or birth date, 
will not be linked to specific study results. Some of PPAT drawings completed during this 
study will be photographed, no personal information will be written on the PPAT 
drawings. Study material and artwork may later be utilized in presentation or publication 
of the study. 
 
If you have questions or concerns please contact Seung Bin Oh, a graduate student in 
mental health counseling and art therapy counseling program and Primary Investigator on 
this study. He can be reached at 620-757-5719 or soh5@g.emporia.edu. You may also 
contact his chief committee and faculty advisor, Dr. Gaelynn P. Wolf Bordonaro; she can 
be reached at gwolf@emporia.edu or 620.341.5809. 
 
"I have read the above statement and have been fully advised of the procedures to be 
used in this project. I have been given sufficient opportunity to ask any questions I had 
concerning the procedures and possible risks involved. I understand the potential risks 
involved and I assume them voluntarily. I likewise understand that I can withdraw 
from the study at any time without being subjected to reproach." 
 
___________________________            _____________________________ 
Participant                               Date 
___________________________             _____________________________  
Parent or Guardian (if subject is a minor)       Date 
 
Appendix F 
Formal Elements Art Therapy Scales Rating Sheet 