
S

Sample is a small group drawn from a population which is considered to be representative of that population, so that the conclusions drawn on the sample could be applicable/valid for that population.  To select such a sample, a number of sampling procedures are available, of which random, stratified, etc., are the frequently used ones.

Sampling Error is the difference between the value obtained on a sample and that of the entire population.
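As a minimal sketch of the definition above, the following Python snippet takes the mean as the value being compared (an assumption; the definition applies to any statistic). The marks and the function name `sampling_error` are illustrative only.

```python
from statistics import mean

def sampling_error(sample, population):
    """Sample mean minus population mean (the mean is an assumed choice of statistic)."""
    return mean(sample) - mean(population)

# Illustrative marks, not real data
population = [40, 45, 50, 55, 60, 65, 70, 75]
sample = [45, 55, 70]          # three cases drawn from the population

error = sampling_error(sample, population)
print(round(error, 2))
```

A negative result means the sample under-estimates the population value; a positive result means it over-estimates it.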

Sampling Validity is a measure of content validity.  It is obtained by determining how far the test items are representative samples of the universe of behaviours that define the variable to be measured.

Scale is a measuring device that provides a set of standards for comparing the object/trait to be measured in order to quantify the magnitude by assigning a number or some mathematical value.  The term is of wide applicability.

Scatter Diagram is a chart used to determine the relationship between two variables. Vertical dimensions represent the scores on one test and the horizontal dimensions represent the other.  To reflect the pair of scores for a particular individual, a tally mark is entered on the diagram. This diagram serves the following two purposes :

1)      To determine if there is a useful relationship between the two variables.
2)      To determine the type of equation to be used to describe the relationship.

Scatter Plot [See, Scatter Diagram].

Scedasticity is the relative variability of the rows and columns of a double entry table or scatter diagram. [Also see, Homoscedasticity].

Schedule is a written or printed list.  It may be a detailed written plan for future procedure to be followed or an outline of regularly recurring events.  It may also refer to a form or outline used to guide data gathering. In evaluation, mostly it refers to a blank form with questions and space for responses or comments.  Sometimes, it would also refer to the procedures laid down for guiding an experiment.

Scholastic Abilities, both lower order (knowledge, comprehension and application) and higher order (analysis, synthesis and evaluation), are concerned with the cognitive aspect of behaviour.  The term ‘cognitive’ (scholastic) usually refers to the behaviour in which there is a search for information (awareness).  Primary mental abilities (perceptual speed, verbal relations, word fluency, memory, etc.) are associated with this category of behaviour. The most important constituents of scholastic development are the development of the power of discrimination, reasoning, seeking relevance of knowledge to real life situations, decision making, etc.  Most of our educational programmes are confined to the development of scholastic abilities only.  The achievement of these abilities can be assessed directly by observing the students’ responses, which show how well the abilities have been acquired.  [Also see, Non-scholastic Traits].

Score or Mark is a number assigned to an examinee to provide a quantitative description of his performance on a particular test or examination.  40 out of 50, 75 out of 100, etc., are the examples.  A mark may also be expressed in percentage, i.e., 75%, 80%, etc.

Scoring Formula refers to the procedure for obtaining the raw score on the test from the number of correct, partially correct, incorrect and omitted responses.  Although various formulae are in use, ‘scores equal to the number right’ is considered to be the most convenient one.  Other methods require correction for guessing, which involves additional efforts to maintain accuracy.

Scoring Key refers to the right answer usually in the selection type items/questions of a test or examination. [Also see, Key].

Scoring Matrix is a response grid or table usually made up on a graph paper.  It is used for organizing and recording the responses of each individual on every item of a test battery.

Selection is the choice of an item/person for inclusion in some group, class or category. It is one of the six purposes of evaluation.  This process helps to choose most suitable persons for a programme or a class or a job.

Selection Type Item/Question refers to the objective type test items, viz., constant alternatives, multiple choice, multiple facet, matching and rearrangement types. Each of these types is provided with a number of options, out of which the correct answer has to be selected. In other words, the items of this category require an examinee to select the correct/most suitable answer from among the given options.

Self-Assertion refers to the tendency to press for the achievement of one’s own goals.

Self-Assessment is an appraisal of one’s own personal qualities or traits, as measured by oneself with the help of a behaviour checklist or the like. It is distinguished from the self-assessment questionnaire.

Self-Realisation is the stage of realizing one’s capability of mental and moral nature.  It is a basis for the attainment of character formation, the final level of the affective behaviour.

Self-Report is the report of an individual as visualized and furnished by himself. It may be a simple statement of personal facts (age, marital status, occupation, etc.) or an elaborate personality rating, questionnaire, autobiography, etc.  It would help assess one’s line of thinking or way of visualizing various social situations.

Self-Report Inventory is a device/questionnaire used in personality measurement.  In this device, a person indicates a certain kind of behaviour that describes how he would react to certain imaginary situations.  It is distinguished from ‘self-inventory’ which is usually a list of traits and not of behaviours (conduct).

Semantic Differential is a common scale for eliciting affective responses.  It usually consists of 7-point ratings on a series of bipolar continua, such as strong-weak, poor-rich, active-passive, etc. Here, the positions of positive and negative characteristics are randomly reversed (e.g., the first column may run from positive to negative, i.e., strong-weak, but the next column may run from negative to positive, i.e., poor-rich, etc.) to avoid the ‘halo-effect’.  [See, Fig. in Appendix].

Seminar is often used as a technique for assessing the learners’ growth – both in the scholastic and non-scholastic aspects of behaviours.  It may be of different types, viz., paper presentation, panel discussion, symposium, etc. This technique is suitable for assessing the development of complex skills, such as application, analysis, synthesis and evaluation, and also personality traits like interests, attitudes, values, social skills, expression, etc.

Set is the second of the seven hierarchical levels of psychomotor behaviour.  It is a preparatory adjustment or readiness for a particular kind of action or experience.  Set takes place in three stages, namely, (1) Mental, (2) Physical and (3) Emotional.  As it is a part of the learning objectives under the psychomotor domain of the TEO, its development needs evaluation.

Short Answer Question/Test is one which requires the examinee to produce brief, fragmentary answers.  The answer may usually be a number, word, phrase or a sentence.

Simple Sampling is used for the situation in which a sample is to be drawn from a universe without any prior sub-division of the universe into groups of relatively similar units (i.e., strata).

Situational Test requires an examinee to react/respond to some problems or situations which are created artificially.  On the basis of his response/reactions to real life situations, non-scholastic aspects of his growth are assessed.  For example, to measure one’s honesty, he is put in an artificial situation which stimulates a high temptation to steal. The extent to which he resists the inclination to steal indicates the extent to which he is honest.  Traits, habits, attitudes and adjustments are some of the important characteristics which may be assessed through these tests.

Skew or Skewed Distribution refers to an asymmetrical distribution.  It indicates the nature of examination – whether too difficult or too easy.  In case of the examination being too difficult, most of the scores cluster near the zero point of the scoring range and the skew will be positive.  Whereas, in case of the examination being too easy, the distribution will form the negative skew which is usually not considered as a problem.  In case of criterion referenced tests, the distribution of this type (negative skew) is desirable, as it would indicate that a majority of the students have attained the objective of the course or programme. [Ref.: Diagrams in Appendix].
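The sign of the skew can be checked numerically. The sketch below uses the moment coefficient of skewness (third central moment divided by the 1.5th power of the second), which is a common measure but an assumption here, since the entry itself gives no formula. The score lists are invented for illustration.

```python
def skewness(scores):
    """Moment coefficient of skewness; positive = tail to the right, negative = tail to the left.

    Raises ZeroDivisionError if all scores are identical (zero variance).
    """
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in scores) / n   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical marks out of 20
hard_test = [2, 3, 3, 4, 4, 5, 6, 9, 14, 20]   # most scores cluster near zero
easy_test = [20 - s for s in hard_test]        # mirror image: scores cluster near the maximum

print(skewness(hard_test) > 0)   # difficult test: positive skew
print(skewness(easy_test) < 0)   # easy test: negative skew
```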

Skimming and Scanning are the skills of fast reading.  They are closely related to the higher order abilities of the scholastic category.  ‘Skimming’ is an organizational approach for learning about the ideas of an article or a book.  It is of three kinds, viz., pre-view, over-view and review. ‘Scanning’ is an effort to locate specific points, words, numbers, names or ideas, or an attempt to answer specific questions.  Possession of these skills is very helpful to learners in advanced studies, researchers, teachers, reviewers, editors of magazines/dailies and professional executives in improving their efficiency in discharging day-to-day responsibilities.

Slope or Regression Coefficient refers to the rate of ascent or descent of the regression line relative to the horizontal axis, and is indicated by the letter b in the formula y = a + bx.  The slope is determined by dividing the number of vertical units (rise) by the number of horizontal units (run).  For example, if the line rises four vertical units for every ten horizontal units, we say,

$$\text{slope} = \frac{\text{rise}}{\text{run}} = \frac{4}{10} = 0.4$$
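The same arithmetic can be sketched in Python. The function names and the intercept value `a = 2` are hypothetical, chosen only to show how b enters y = a + bx.

```python
def slope(rise, run):
    """Slope b = vertical change (rise) divided by horizontal change (run)."""
    return rise / run

def predict(a, b, x):
    """Value of y on the regression line y = a + bx."""
    return a + b * x

b = slope(4, 10)            # the example above: 4 vertical units per 10 horizontal units
y = predict(2, b, 10)       # hypothetical intercept a = 2, evaluated at x = 10
print(b, y)
```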

Slot is the gap arising in a series or sequence of words or phrases that is required to be filled up by the examinee.  This may be contrasted with ‘Incomplete Statement Item’, ‘Cloze Test’, etc., for better understanding.

Social Quality is one of the non-scholastic aspects of learners’ growth closely related to language education. Empathic qualities (inter-personal relationship), social maturity, social cohesion, co-operation, leadership, etc., are some of the social qualities which a learner is supposed to develop, especially at the higher education level, and which therefore require evaluation.

Sociogram is a graphical drawing used in the socio-metric method for determining one’s adjustability in the social context.  It uses certain symbols and marks to indicate the patterns of social acceptance or rejection.  The patterns thus provide data for value-judgement/decision.

Sociometric Method is also known as sociometry. It is a quantitative technique of measurement useful for determining the extent to which one’s social adjustability has been developed. The sociometric position of each student (viz., popular, star, isolated or neglected, etc.)  in a class can be obtained through this method.  Social adjustment is one of the important affective aspects of a learner, which needs evaluation.

Sociometric Status is also one of the non-scholastic aspects of learners’ growth closely related to higher education.  This refers to the status of an individual in the classroom or in an institution or in a society of which he or she is a part.  Star, popular, isolated, neglected cliques, etc., are some of the statuses which a learner usually develops in course of his/her education.

 

Sociometry  is a quantitative method for determining and describing the pattern of acceptances and rejections in a group of people.  [Also see, Sociometric Method and Sociometric Status].

Source Material refers to a pool of graphic and oral texts drawn for the purpose of developing tests. Such materials, when properly selected, provide the purpose, the relevance, the coherence and the unity for the test.

Speaking Test attempts to measure the speech quality of an examinee.  As viewed by Vallete [1967, p.80], speaking is a social skill.  It is more than pronunciation and intonation.  At the functional level, speaking makes oneself understood.  At a more refined level, it requires the correct and idiomatic use of the language. Hence, speaking test in a broader sense, is expected to measure all the components of speaking.  Pronunciation, stress, fluency (free response), organization and comprehensive speaking, etc., may form parts of a speaking test [Also see, Oral Test, M’s –  The Three M’s].

Spearman-Brown Formula is used to predict the relationship between the reliability of a test and its length when additional items are included.  It however assumes that any items added to the test are of the same kind as those already present.  The following is the formula used for the above purpose:

$$r_n = \frac{n\, r_s}{(n-1)\, r_s + 1}$$

where,  $r_n$  =  reliability of a test when adjusted to ‘n’ times its original length.

$r_s$  =  the observed reliability of the test at its present length.

$n$  =  the number of times the length of the test is to be augmented (e.g., for a test of 20 items, if 10 items are added, ‘n’ would equal 1.5).
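A short Python sketch of the formula follows. The observed reliability of 0.60 is a hypothetical value; the lengthening factor reproduces the 20-item example above.

```python
def spearman_brown(r_s, n):
    """Predicted reliability r_n of a test lengthened to n times its present length.

    r_s: observed reliability at the present length
    n:   ratio of new length to present length
    """
    return (n * r_s) / ((n - 1) * r_s + 1)

n = (20 + 10) / 20                 # 20 items plus 10 added items gives n = 1.5
r_n = spearman_brown(0.60, n)      # 0.60 is an assumed observed reliability
print(round(r_n, 3))
```

With n = 1 the function returns the observed reliability unchanged, which is a quick sanity check on the formula.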

Specific Determiner refers to certain characteristic in the statement of a true-false test item that supplies an unintended clue to the correct answer.  Ebel [1979] clarifies it with the following example.  Statements including the words ‘every’, ‘always’, ‘entirely’, ‘absolutely’, and ‘never’ are more likely to be false than true.  Similarly, statements containing the words ‘sometimes’, ‘usually’, ‘often’ and ‘ordinarily’ are more likely to be true than false.

Specific Learning Outcome is an intended outcome (objective) of the instructional process stated in specific behavioural terms.  [See, Objectives, Educational Objectives, Outcomes].

Specification of Needs refers to a structured statement of the communicative demands of a situation.  It may be formal and elaborate.

Speededness of a test refers to the extent of an examinee’s quickness in working through it.  It is measured by comparing the proportion of examinees who do not reach the last item with the proportion of those who answer up to the last item of the test.

Speed Test is one which consists of items that are usually easy, but the time allowed to respond is limited.  If  enough time is given, all the persons would respond correctly.  Here, examinees are compared on their speed of performance rather than on knowledge.  Speed test is usually contrasted with the power test.

Split-Halves (Reliability) is one of the methods for estimating internal consistency reliability.  The test is divided into two halves, usually the odd numbered and the even numbered items; sometimes the first and the second halves of the test are used instead.  Each half is scored independently. The correlation between the scores obtained on these two halves of the test is then corrected with the help of the following formula :

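$$r_{xx} = \frac{2\, r_{\frac{1}{2}\frac{1}{2}}}{1 + r_{\frac{1}{2}\frac{1}{2}}}$$

where   $r_{xx}$  =  estimated reliability of the whole test.

$r_{\frac{1}{2}\frac{1}{2}}$  =  reliability obtained by the split-halves method, i.e., the correlation between the two halves.

The values of reliability estimated by using various methods may have slight variations among themselves.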
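The procedure can be sketched in Python as below, assuming an odd/even split and the Pearson product-moment correlation between the half scores. The item marks (1 = right, 0 = wrong) for five examinees on a six-item test are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def split_half_reliability(item_scores):
    """Correct the odd/even half-test correlation to a whole-test estimate."""
    odd = [sum(row[0::2]) for row in item_scores]    # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]   # items 2, 4, 6, ...
    r_half = pearson(odd, even)
    return 2 * r_half / (1 + r_half)

# One row per examinee, one column per item (hypothetical data)
item_scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
r_xx = split_half_reliability(item_scores)
print(round(r_xx, 3))
```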

Standard Deviation is a measure of variability of scores.  It indicates the dispersion or spread of a set of scores around their mean value.  It is equal to the positive square root of variance.  It is sometimes called ‘sigma’ and represented by the symbol ‘σ’.  Standard Deviation is defined as the root mean square of all deviations of individual scores from the arithmetic mean.  The formula for the computation of standard deviation is as follows:

                                     

$$\sigma = \sqrt{\frac{\sum f\,(x - \bar{x})^2}{\sum f}}$$

Where,  $\sigma$  =  standard deviation

$x$  =  any mark or score

$\bar{x}$  =  mean score

$\sum$  =  a symbol meaning sum of

$f$  =  frequency
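As a minimal sketch, the formula can be applied to a mark frequency table in Python. The table below (mark mapped to number of students) is made up for illustration.

```python
from math import sqrt

def sd_from_frequencies(freq):
    """Standard deviation from a {mark: frequency} table, dividing by the total frequency."""
    total = sum(freq.values())
    mean = sum(x * f for x, f in freq.items()) / total
    variance = sum(f * (x - mean) ** 2 for x, f in freq.items()) / total
    return sqrt(variance)

# Hypothetical mark frequency table: mark -> number of students
marks = {10: 2, 20: 3, 30: 5, 40: 3, 50: 2}
sigma = sd_from_frequencies(marks)
print(round(sigma, 2))
```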

Standard deviation is an important measure useful for the interpretation of test scores and would also provide a qualitative analysis of performance.  The teacher can prepare a mark frequency table and work out the standard deviation. When the number of students is large, the following simplified formula of Pal Dietriech can be used to estimate the standard deviation :

$$\sigma = \frac{\text{Sum of the highest 1/6th of scores} - \text{Sum of the lowest 1/6th of scores}}{\tfrac{1}{2}\ \text{the total number of students}}$$

where, $\sigma$ is the standard deviation.
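The shortcut can be sketched in Python as follows. How to handle a class size that is not a multiple of six is not stated in the entry, so rounding the size of one-sixth to the nearest whole number is an assumption; the scores are illustrative.

```python
def shortcut_sd(scores):
    """Estimate the standard deviation from the top and bottom sixths of the scores."""
    ordered = sorted(scores)
    n = len(ordered)
    k = round(n / 6)               # assumption: one-sixth of the group, rounded
    if k == 0:
        raise ValueError("need enough scores to form a sixth of the group")
    top = sum(ordered[-k:])        # sum of the highest sixth
    bottom = sum(ordered[:k])      # sum of the lowest sixth
    return (top - bottom) / (n / 2)

# Hypothetical marks for a class of 12
scores = [35, 42, 48, 50, 53, 55, 57, 60, 62, 66, 71, 80]
estimate = shortcut_sd(scores)
print(round(estimate, 2))
```

The result is an estimate only and will usually differ somewhat from the value given by the full formula.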

Standard Error of the Mean provides an idea about the limits of marks within which the arithmetic mean will lie if the test is given over and over again.  It is calculated by the following formula :

$$\text{Standard error of the mean} = \frac{\text{S.D.}}{\sqrt{N-1}}$$

where, ‘N’ is the number of students
S.D. is the standard deviation.
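A small Python sketch of this calculation is given below; the standard deviation of 12 and the class size of 37 are hypothetical values.

```python
from math import sqrt

def standard_error_of_mean(sd, n):
    """S.D. divided by the square root of (N - 1), as in the formula above."""
    if n < 2:
        raise ValueError("at least two students are needed")
    return sd / sqrt(n - 1)

se_mean = standard_error_of_mean(12, 37)   # assumed S.D. = 12, N = 37
print(se_mean)
```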

Standard Error of Measurement is an estimate of the margin of error in an individual’s score due to the imperfect reliability of an instrument.  It is estimated by multiplying the standard deviation of the scores by the square root of one minus the reliability coefficient.  That is,

$$S_e = \text{SEM} = \text{S.D.}\,\sqrt{1 - r_{xx}}$$

Where,  $S_e$  =  Standard error of measurement.

S.D.  =  Standard deviation

$r_{xx}$  =  reliability estimate (coefficient)
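In Python the calculation can be sketched as below. The standard deviation of 10 and the reliability of 0.91 are assumed values used only to illustrate the formula.

```python
from math import sqrt

def standard_error_of_measurement(sd, r_xx):
    """S.D. multiplied by the square root of (1 - reliability coefficient)."""
    if not 0 <= r_xx <= 1:
        raise ValueError("reliability coefficient must lie between 0 and 1")
    return sd * sqrt(1 - r_xx)

sem = standard_error_of_measurement(10, 0.91)   # assumed S.D. = 10, r_xx = 0.91
print(round(sem, 2))
```

A perfectly reliable instrument (reliability of 1) gives a standard error of measurement of zero.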

 

Standardisation refers to the process of administering a carefully constructed test to a large representative sample of examinees under standard conditions for the purpose of determining/establishing the norms.

Standardised Test is one that is constructed in accordance with detailed specifications.  It consists of empirical items with appropriate difficulty level selected on the basis of several tryouts over the normative samples (usually more than 1000 persons) under prescribed conditions.  Such tests are fully evaluated and capable of yielding the relevant information for necessary interpretation.  It is accompanied by a manual providing definite directions for uniform administration and procedures for marking, established norms, and reliability and validity data.  The following are the steps for developing standardized tests :

1)      Collection of source materials

2)      Assembling of items

3)      Pilot testing

4)      Preliminary tryout

5)      Test and item analysis

6)      Standardisation (establishing norms)

7)      Establishment of reliability and validity

 

Standard Scores are the scores which are derived from the raw scores (obtained scores) in order to express an individual’s position/distance from the mean in terms of the standard deviation of the distribution. Z-score is considered to be the basic standard score and the formula for obtaining it is

 

$$Z = \frac{x - \bar{x}}{\sigma}$$

where,  $x$  =  any raw score

$\bar{x}$  =  mean of a set of raw scores

$\sigma$  =  standard deviation
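The following Python sketch shows how Z-scores make marks from two different tests comparable. The subjects, raw scores, means and standard deviations are hypothetical.

```python
def z_score(x, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (x - mean) / sd

# Hypothetical results for one student on two tests
z_english = z_score(60, 50, 10)   # raw 60, class mean 50, S.D. 10
z_maths = z_score(70, 65, 10)     # raw 70, class mean 65, S.D. 10
print(z_english, z_maths)
```

Although the raw score in the second test is higher, the first test shows the better relative standing, since its Z-score is larger.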

             The other standard scores derived from Z-score are T-scores, Deviation IQs, Stanines and so on.   The following are the uses and advantages of standard scores :

1)      They represent equal units of measurement and facilitate comparison.

2)      Standard scores from different tests are comparable to the extent that they may be combined.  This applies irrespective of the number of items and their difficulties.

3)      They facilitate interpretation in terms of ability or achievement and indicate relative standing in the group at the same time.

Stanine is a condensed form of T-scale.  ‘Stanine’ is originally derived from ‘standard nine’.  It is used to convert the raw scores into single-digit scores which range from 1 to 9 with a mean of 5 and a standard deviation of 2.  The percentages of cases at each Stanine level, as explained in the instructional material of the course on Evaluation Methodology and Examination (Intermediate – Lesson 15, p.15), are given below :

a)      Low (4%) = Stanine 1 (4%)

b)      Below average (19%) = Stanine 2 (7%) and 3 (12%)

c)      Average (54%) = Stanine 4 (17%), 5 (20%) and 6 (17%)

d)      Above average (19%) = Stanine 7 (12%) and 8 (7%).

e)      High (4%) = Stanine 9 (4%)

(See, Fig. in Appendix)

Stanine Score is a converted single digit standard score on a nine unit Stanine scale.  It is obtained on the basis of the raw score mean and standard deviation of the group.  It is equal to 2Z + 5 (rounded to the nearest whole number).

$$Z = \frac{x - \bar{x}}{\sigma}$$

             It is also often referred to as ‘Stanine Standard Score’.  [Also see, ‘Stanine’].
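The conversion 2Z + 5 can be sketched in Python as below. Two details are assumptions not stated in the entry: exact halves are rounded upwards, and results falling outside the scale are limited to the range 1 to 9. The raw scores, mean and standard deviation are hypothetical.

```python
import math

def stanine(raw, mean, sd):
    """Stanine score: 2Z + 5 rounded to the nearest whole number, kept within 1..9."""
    z = (raw - mean) / sd
    value = math.floor(2 * z + 5 + 0.5)   # assumption: round halves upwards
    return max(1, min(9, value))          # assumption: clamp to the nine-unit scale

# Hypothetical group with mean 50 and S.D. 8
print(stanine(62, 50, 8))   # Z = 1.5
print(stanine(50, 50, 8))   # Z = 0
```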

Star is the learner who is recognized as the best in his group by virtue of his excellence in scholastic and non-scholastic aspects of behaviour.

Statistic refers to any value characterizing some aspect of a sample, frequently used as an estimate of a population parameter.  Ebel [1979] further clarifies that the number of cases in the sample, the value of the measures in the sample, the standard deviation of those measures, and the correlation between two sets of measures for the members of the sample are all statistics.

Statistical Quantities, in the area of evaluation, refer to the values estimated through test and item analysis, such as Mean, Median, Mode, Standard deviation, Standard error of mean, and question indices like Choice Index, Facility value, Discrimination index, Mean ability index, Effectiveness of distractors, etc.

Status Validity is the correspondence between test scores and the concurrent status of testees, e.g., between scores on a test of leadership and actual status in position.  It is a form of concurrent validity.

Stem is the initial part of an objective test item – either a partial sentence to be completed, a question to be answered (from among the given options), or several statements leading to a question or an incomplete phrase.

Step Ladder Test is one of the computer adaptive tests.  It consists of items that are pre-analysed, calibrated and arranged in the order of difficulty.  The computer algorithm would select an entry level set of items for a given examinee.  Based on the experience of the examinee with a given set of items, a more appropriate set would be presented to him as a next step. And then onwards other sets of items will be presented one by one in different stages of the difficulty continuum by the computer.

Strategies in evaluation are the means of achieving the objectives of the plan.  Once the objectives have been set, the formulation of strategies begins.  Strategy formulation anticipates formulation of tools and techniques.  Different strategies are to be adopted for different purposes of evaluation and testing. For example, the strategies adopted for recruiting candidates for Banks cannot be made applicable to Defence organizations.  Therefore, strategies assume importance as a tool in the programme of evaluation.

Stratified Random Sampling is one of the widely used sampling procedures for the selection of samples from a population of complex nature.  In this, the population is divided into a number of non-overlapping categories which together include all cases. Cases are then taken at random from each category, the number from each category being proportionate to the total number therein.  The required number of cases for developing standardized tests or establishing norms is selected by adopting the above procedure.
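The procedure can be sketched in Python with the standard `random` module. The stratum names, the population of 1000 numbered cases and the 10% sampling fraction are all hypothetical; rounding each stratum's share to the nearest whole case is also an assumption.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction at random from every non-overlapping stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        k = round(len(members) * fraction)        # proportionate to the stratum size
        sample[name] = rng.sample(members, k)     # random draw without replacement
    return sample

# Hypothetical population of 1000 cases split into two strata
strata = {
    "urban": list(range(0, 600)),
    "rural": list(range(600, 1000)),
}
sample = stratified_sample(strata, 0.10, seed=1)
print({name: len(cases) for name, cases in sample.items()})
```

Because the same fraction is taken from each stratum, the sample keeps the 60:40 proportion of the population.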

Structured Question is one that consists of an introductory statement followed by a series of specific sub-questions which relate to different themes in the main statement.  In this type, the scope of the answer/the area covered is limited by the introductory statement (stem).

Subjective Questions are the questions for which answers are not usually predetermined and hence the objective marking is difficult.  [Also see, Supply Type Questions, Essay Type Questions].

Sub-Test refers to a division of a test designed to measure a particular aspect of behaviour which the test as a whole attempts to measure.

Summative Assessment is the final and overall assessment of the extent to which the objectives of learning are achieved over the period of the entire course. It is usually done at the end of a learning programme for the purpose of providing a statement of result, i.e., to decide whether a candidate has passed or failed.  No remedial measures are usually possible on the basis of this assessment as it conveys only the final result. [Also see, Formative Assessment].

Supply Type Question requires the examinee to supply the answers within a prescribed time by recalling facts, knowledge, specifics, etc., already acquired by him. The form of the answer may be short or long. In this type of questions, a significant degree of freedom is available to choose, arrange and express the information asked for.  Short answer questions, essay questions, problem solving questions, etc., are some of the supply type questions.  These questions are also regarded as subjective type questions, free response questions, etc.

Synthesis is the process of putting together, generalizing or integrating information, procedures and other such elements that are scattered, so as to form a new structure. This is one of the higher order abilities and the last but one among the six hierarchical levels of the scholastic category classified under the cognitive domain of the TEO.  Write, modify, produce, constitute, etc., are some of the action verbs that would represent synthesis.  Condensing questions of the ‘précis writing’ type, summary writing, etc., are some of the techniques through which the skill of synthesis could be measured.