Radioprotection 60-3

S. Ito et al.: Radioprotection 2025, 60(3), 211–220

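Table 3. Comparison of Scale Scores for Artificial Intelligence Chatbots and Web Search (Continuous Variables).

Groups: 1 = ChatGPT-3.5, 2 = Copilot, 3 = Gemini (AI chatbots, normal level); 4 = ChatGPT-3.5, 5 = Copilot, 6 = Gemini (AI chatbots, 6th grade level); 7 = Google (web search).

Measure / Group                      M         SD        Significant differences

PEMAT (scores)
Understandability
  1 ChatGPT-3.5 (normal)             63.8      13.3      1 < 3,5,6
  2 Copilot (normal)                 71.2      10.9      7 < 2 < 6
  3 Gemini (normal)                  76.5      11.5      1,7 < 3
  4 ChatGPT-3.5 (6th grade)          70.4      8.5       7 < 4 < 6
  5 Copilot (6th grade)              75.8      12.7      1,7 < 5
  6 Gemini (6th grade)               84.1      7.4       1,2,4,7 < 6
  7 Google                           60.8      17.9      7 < 2–6
Actionability
  1 ChatGPT-3.5 (normal)             1.3       5.1
  2 Copilot (normal)                 0.0       0.0
  3 Gemini (normal)                  6.9       16.3      7 < 3
  4 ChatGPT-3.5 (6th grade)          0.7       3.7
  5 Copilot (6th grade)              1.9       10.8
  6 Gemini (6th grade)               4.3       13.7
  7 Google                           1.4       8.4       7 < 3

jReadability (scores)
Total number of characters *
  1 ChatGPT-3.5 (normal)             451.0     126.0     1 < 7
  2 Copilot (normal)                 649.0     340.2     2 < 7
  3 Gemini (normal)                  987.8     429.9     3 < 7
  4 ChatGPT-3.5 (6th grade)          378.2     95.7      4 < 7
  5 Copilot (6th grade)              764.4     553.6     5 < 7
  6 Gemini (6th grade)               962.3     364.9     6 < 7
  7 Google                           3683.6    4885.3    1–6 < 7
Number of words per sentence
  1 ChatGPT-3.5 (normal)             28.1      4.1       1 < 7
  2 Copilot (normal)                 36.7      16.5      3,4,6 < 2
  3 Gemini (normal)                  21.5      10.0      3 < 2,7
  4 ChatGPT-3.5 (6th grade)          22.2      3.3       4 < 2,7
  5 Copilot (6th grade)              30.3      20.0      6 < 5 < 7
  6 Gemini (6th grade)               19.5      5.8       6 < 2,5,7
  7 Google                           39.1      52.6      1,3–6 < 7
Readability score
  1 ChatGPT-3.5 (normal)             1.7       0.5       1 < 3–6
  2 Copilot (normal)                 2.1       1.0       7 < 2 < 3,5,6
  3 Gemini (normal)                  2.9       1.1       1,2,7 < 3
  4 ChatGPT-3.5 (6th grade)          2.3       0.5       1,7 < 4
  5 Copilot (6th grade)              2.7       1.2       1,2,7 < 5
  6 Gemini (6th grade)               2.9       0.9       1,2,7 < 6
  7 Google                           1.5       3.0       7 < 2–6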

ANOVA was performed at a significance level of 5%, and groups with significant differences are indicated by inequalities (for example, 1 < 7 means that the mean of group 1 is significantly lower than that of group 7).

* Because jReadability can evaluate at most 20,000 characters, only the first 20,000 characters of each text were evaluated.

results of this study of radiation-related text are similar to those of previous studies that evaluated the quality of patient education materials about medical care produced by AI chatbots using PEMAT-P and other methods (Ayoub et al., 2023; Dihan et al., 2024; Musheyev et al., 2024; Pan et al., 2023). However, AI chatbots have the following problems: they cannot present effective diagrams and pictures to explain key concepts, and they produced obvious errors in 48 sentences (5.9% of the material produced), which raises questions about the reliability of the information. Although AI chatbots may be a useful tool to help local residents understand radiation disaster prevention, issues remain with the reliability of the information and with the effective presentation of numbers and images. Further research, including the creation of effective prompts, may be needed to provide more understandable, actionable, and readable materials.
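As an illustration of the group comparisons summarized in Table 3, the following is a minimal sketch of how a one-way ANOVA F statistic can be computed from per-group means and standard deviations alone. It is not the authors' analysis code. The function name `anova_from_summary` and the per-group sample size `ASSUMED_N` are assumptions introduced for illustration, since the number of documents per group is not given in this part of the paper.

```python
from typing import Sequence, Tuple


def anova_from_summary(
    means: Sequence[float], sds: Sequence[float], ns: Sequence[int]
) -> Tuple[float, int, int]:
    """One-way ANOVA F statistic from per-group mean, sample SD, and size."""
    k = len(means)
    n_total = sum(ns)
    # Grand mean is the size-weighted average of the group means.
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Within-group sum of squares, recovered from the sample SDs.
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Understandability row of Table 3 (groups 1 to 7).
means = [63.8, 71.2, 76.5, 70.4, 75.8, 84.1, 60.8]
sds = [13.3, 10.9, 11.5, 8.5, 12.7, 7.4, 17.9]
ASSUMED_N = 30  # hypothetical group size, NOT taken from the paper
ns = [ASSUMED_N] * len(means)

f_stat, df_b, df_w = anova_from_summary(means, sds, ns)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F statistic only indicates whether the group means differ overall. The pairwise inequalities in Table 3 would additionally require a post hoc comparison, which is not named in this part of the text, and the resulting F value depends directly on the assumed group size.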

This section discusses the comparison between documents created by the AI chatbots and web page documents from Google Search results. On radiation-related topics, the documents generated by the AI chatbots were more understandable than the web-based documents, and their Japanese text was also easier to read. However, the normal-level ChatGPT-3.5 output did not differ significantly from the web page documents in the Google Search results. The results of this study differed from those of previous studies. For example, in terms of text difficulty, Shen et al., who used ChatGPT-3.5 and Google Search to answer patients’ questions about medical practice guidelines, reported that ChatGPT-3.5 produced text that was more difficult to read than the Google Search results (Shen et al., 2024). In terms of understandability, the same study found no difference between ChatGPT-3.5 and Google text (Shen et al., 2024). In addition, Ayoub et al. evaluated general medical knowledge provided by ChatGPT (M = 68.2%, SD = 4.4) and by Google Search pages (M = 89.4%, SD = 5.9) and reported that the Google results were easier to understand (Ayoub et al., 2023). These differences from previous studies may depend on the topic studied. For medical knowledge, there are many websites aimed at the general public. While web pages created by medical institutions can be difficult to understand because of their highly technical text, many other web pages, such as those created by companies, are reported to be easy to understand (Ito and Furukawa, 2024).
With regard to nuclear disaster prevention, however, although many nuclear-related materials for the general public have been published online in Japan since the accident at TEPCO’s Fukushima Daiichi Nuclear Power Station, the texts of these online materials are reported to be difficult to read (Ito et al., in press). Therefore, for the present topic, it is possible that the web pages found with Google are harder to understand and harder to read than the text produced by AI chatbots.

Next, we discuss the effect of giving the prompt, “Please teach me at a 6th grade level.” When this prompt was given, there was no significant change in the understandability of the sentences produced by the AI chatbots. However, the addition of the prompt significantly decreased the difficulty level of the Japanese text for ChatGPT-3.5 and Copilot, and Gemini produced the lowest difficulty level of Japanese text both with and without the prompt. The finding that the AI chatbots in this study decreased
