Script concordance test: an approach to the evaluation of clinical reasoning in uncertain contexts
Little research has been done in Brazilian medical education on the evaluation of clinical reasoning in situations of uncertainty. The most common written tests are still multiple-choice, which can evaluate how well-defined problems are handled; in practice, however, most situations involve uncertainty. A method for evaluating clinical reasoning in contexts of uncertainty was developed on the basis of cognitive script theory of professional reasoning. The objectives of the research were to develop, apply, and analyze this methodology in a Brazilian educational setting, based on clinical situations in Geriatrics involving diagnostic, therapeutic, or ethical dilemmas. A group of specialists in the area and a group of undergraduate students completing their Geriatrics internship took the test. Comparison of the results provided evidence of the instrument's validity: it distinguished clinical reasoning according to the participants' level of experience. The mean score of the specialists (80.41) was higher than that of the students (70.71) (p < 0.001). In addition, analyses of internal consistency and a G-study design yielded results consistent with a scoring system that seeks to evaluate a professional skill. In conclusion, a script concordance test in the Portuguese language, applied in a Brazilian teaching institution, may be a viable alternative for evaluating clinical reasoning in contexts of uncertainty.
Universidade Federal de São Paulo (UNIFESP), EPM, São Paulo, Brazil
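Script concordance tests are typically scored by aggregate (panel-based) scoring: the modal panel answer earns full credit and other answers earn partial credit in proportion to how many panelists chose them. A minimal sketch of that scoring rule; the panel responses and the -2..+2 Likert scale below are illustrative assumptions, not data from the study:

```python
from collections import Counter

def sct_item_credits(panel_answers):
    """Aggregate scoring for one SCT item: the modal panel answer earns
    credit 1.0; every other answer earns (its count) / (modal count)."""
    counts = Counter(panel_answers)
    modal = max(counts.values())
    # Credit for each possible response on a -2..+2 Likert scale
    return {choice: counts.get(choice, 0) / modal for choice in range(-2, 3)}

# Hypothetical panel of 10 specialists answering one item
panel = [1, 1, 1, 1, 1, 0, 0, 2, 2, -1]
credit = sct_item_credits(panel)
```

A student's item score is then the credit attached to whichever answer they chose, so partial agreement with the panel is rewarded rather than treated as simply wrong.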
Using Differential Item Functioning to evaluate potential bias in a high stakes postgraduate knowledge based assessment
BACKGROUND: Fairness is a critical component of defensible assessment. Candidates should perform according to ability without influence from background characteristics such as ethnicity or sex. However, performance differs by candidate background in many assessment environments. Many potential causes of such differences exist, and examinations must be routinely analysed to ensure they do not present inappropriate progression barriers for any candidate group. By analysing the individual questions of an examination through techniques such as Differential Item Functioning (DIF), we can test whether a subset of unfair questions explains group-level differences. Such items can then be revised or removed. METHODS: We used DIF to investigate fairness for 13,694 candidates sitting a major international summative postgraduate examination in internal medicine. We compared (a) ethnically white UK graduates against ethnically non-white UK graduates and (b) male UK graduates against female UK graduates. DIF was used to test 2773 questions across 14 sittings. RESULTS: Across 2773 questions, eight (0.29%) showed notable DIF after correcting for multiple comparisons: seven medium effects and one large effect. Blinded analysis of these questions by a panel of clinician assessors identified no plausible explanations for the differences. These questions were removed from the question bank and we present them here to share knowledge of questions with DIF. These questions did not significantly impact the overall performance of the cohort. Group-level differences in performance between the groups we studied in this examination cannot be explained by a subset of unfair questions. CONCLUSIONS: DIF helps explore fairness in assessment at the question level. This is especially important in high-stakes assessment where a small number of unfair questions may adversely impact the passing rates of some groups.
However, very few questions exhibited notable DIF, so differences in passing rates for the groups we studied cannot be explained by unfairness at the question level.
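Item-level DIF screening of this kind is commonly done with the Mantel-Haenszel procedure, which compares focal and reference groups within matched ability strata. A minimal sketch under that assumption (the abstract does not state which DIF statistic was used, and the stratified counts below are hypothetical):

```python
from math import log

def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio for one exam question.
    Each stratum (one ability band, e.g. a total-score decile) is a
    tuple (a, b, c, d):
      a = reference group correct, b = reference group incorrect,
      c = focal group correct,     d = focal group incorrect."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

def mh_delta(odds_ratio):
    """ETS delta metric; conventionally |delta| >= 1.5 flags a large effect."""
    return -2.35 * log(odds_ratio)

# Hypothetical counts for one question across two ability strata
strata = [(10, 5, 8, 4), (20, 10, 16, 8)]
or_mh = mantel_haenszel_or(strata)  # 1.0 here: no DIF in this toy data
```

Matching on ability first is what separates DIF from a raw group comparison: an odds ratio near 1 within strata means equally able candidates from both groups have the same chance of answering correctly.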
The health disparities cancer collaborative: a case study of practice registry measurement in a quality improvement collaborative
<p>Abstract</p> <p>Background</p> <p>Practice registry measurement provides a foundation for quality improvement, but experiences in practice are not widely reported. One setting where practice registry measurement has been implemented is the Health Resources and Services Administration's Health Disparities Cancer Collaborative (HDCC).</p> <p>Methods</p> <p>Using practice registry data from 16 community health centers participating in the HDCC, we determined the completeness of data for screening, follow-up, and treatment measures. We determined the size of the change in cancer care processes that an aggregation of practices has adequate power to detect. We modeled different ways of presenting before/after changes in cancer screening, including count and proportion data at both the individual health center and aggregate collaborative level.</p> <p>Results</p> <p>All participating health centers reported data for cancer screening, but less than a third reported data regarding timely follow-up. For individual cancers, the aggregate HDCC had adequate power to detect a 2 to 3% change in cancer screening, but only had the power to detect a change of 40% or more in the initiation of treatment. Almost every health center (98%) improved cancer screening based upon count data, while fewer (77%) improved cancer screening based upon proportion data. The aggregate collaborative appeared to increase breast, cervical, and colorectal cancer screening rates by 12%, 15%, and 4%, respectively (p < 0.001 for all before/after comparisons). In subgroup analyses, significant changes were detectable among individual health centers less than one-half of the time because of small numbers of events.</p> <p>Conclusions</p> <p>The aggregate HDCC registries had both adequate reporting rates and power to detect significant changes in cancer screening, but not follow-up care. 
Different measures provided different answers about improvements in cancer screening; more definitive evaluation would require validation of the registries. Limits to the implementation and interpretation of practice registry measurement in the HDCC highlight challenges and opportunities for local and aggregate quality improvement activities.</p>
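The power figures quoted above come down to the minimum detectable before/after change in a proportion for a given number of events. A rough normal-approximation sketch (the alpha, power, and baseline values here are illustrative assumptions, not the collaborative's actual parameters):

```python
from math import sqrt
from statistics import NormalDist

def detectable_difference(n, p0, alpha=0.05, power=0.80):
    """Smallest before/after change in a proportion detectable by a
    two-sided two-sample z-test (normal approximation, equal n per
    period, variance taken at the baseline rate p0)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return (z_alpha + z_beta) * sqrt(2 * p0 * (1 - p0) / n)

# Many screening events per period -> small changes are detectable;
# few treatment initiations -> only very large changes are.
screening = detectable_difference(n=10_000, p0=0.5)  # ~0.02
treatment = detectable_difference(n=25, p0=0.5)      # ~0.40
```

This is why the aggregate collaborative could resolve a 2-3% shift in screening rates while treatment initiation, with far fewer events, could only show a change of roughly 40 percentage points.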
How well do second-year students learn physical diagnosis? Observational study of an objective structured clinical examination (OSCE)
BACKGROUND: Little is known about using the Objective Structured Clinical Examination (OSCE) in physical diagnosis courses. The purpose of this study was to describe student performance on an OSCE in a physical diagnosis course. METHODS: Cross-sectional study at Harvard Medical School, 1997–1999, of 489 second-year students. RESULTS: Average total OSCE score was 57% (range 39–75%). Among clinical skills, students scored highest on patient interaction (72%), followed by examination technique (65%), abnormality identification (62%), history-taking (60%), patient presentation (60%), physical examination knowledge (47%), and differential diagnosis (40%) (p < .0001). Among 16 OSCE stations, scores ranged from 70% for arthritis to 29% for calf pain (p < .0001). Teaching sites accounted for larger adjusted differences in station scores, up to 28%, than in skill scores (9%) (p < .0001). CONCLUSIONS: Students scored higher on interpersonal and technical skills than on interpretive or integrative skills. Station scores identified specific content that needs improved teaching.
Validating a set of tools designed to assess the perceived quality of training of pediatric residency programs
Breast and Tumour Volume Measurements in Breast Cancer Patients Using 3-D Automated Breast Volume Scanner Images
Multidimensional DIF analyses: The effects of matching on unidimensional subtest scores
