{"id":23675,"date":"2024-03-13T11:57:20","date_gmt":"2024-03-13T10:57:20","guid":{"rendered":"https:\/\/info.gwdg.de\/news\/?p=23675"},"modified":"2024-03-13T11:58:30","modified_gmt":"2024-03-13T10:58:30","slug":"unleashing-chaos-the-power-of-random-testing-in-scientific-software-development","status":"publish","type":"post","link":"https:\/\/info.gwdg.de\/news\/unleashing-chaos-the-power-of-random-testing-in-scientific-software-development\/","title":{"rendered":"Unleashing Chaos: The Power of Random Testing in Scientific Software Development"},"content":{"rendered":"<p dir=\"auto\" data-sourcepos=\"5:1-5:412\">As a domain scientist doing high-performance computing, you might find that you need to write code yourself, be it custom scripts for efficient file handling, extensions to pre-existing software, or even a larger software project built from scratch. In any case, it is not enough just to write the code. Instead, software development also involves testing and documenting the produced code.<\/p>\n<p dir=\"auto\" data-sourcepos=\"7:1-7:201\">In this article, we introduce a technique that expands your arsenal of tried and tested methods for software testing. We will not go into the details of software testing in general.<\/p>\n<h2 dir=\"auto\" data-sourcepos=\"9:1-9:26\">Property-based testing<\/h2>\n<p dir=\"auto\" data-sourcepos=\"10:1-10:719\">Contrary to what one might assume based on the name, random testing does not mean manually inputting random values and hoping for the best. Instead, with a random testing strategy, you test against a specification, i.e., against properties you believe to hold true. This specification serves both as a test oracle and as the basis for test case generation. Based on the properties of your function, you define invariants that should hold. Then, you check whether they actually hold by randomly generating a large number of test cases. 
This specific variety of property-based testing is implemented in the &#8220;QuickCheck&#8221; software library. Originally written in Haskell, it is now available in many programming languages, including Python.<\/p>\n<h3 dir=\"auto\" data-sourcepos=\"12:1-12:46\"><a id=\"user-content-concrete-example-dna-sequence-alterations\" class=\"anchor\" href=\"#concrete-example-dna-sequence-alterations\" aria-hidden=\"true\"><\/a>Concrete example: DNA sequence alterations<\/h3>\n<p dir=\"auto\" data-sourcepos=\"13:1-13:41\">Make sure the required packages are installed:<\/p>\n<p dir=\"auto\" data-sourcepos=\"13:1-13:41\"><code>pip install pytest pytest-quickcheck biopython<br \/>\n<\/code><\/p>\n<p dir=\"auto\" data-sourcepos=\"13:1-13:41\">Let us assume we are writing a Python program that takes a DNA sequence and alters it in such a way that the resulting protein sequence remains unmodified.<\/p>\n<pre class=\"code highlight\" lang=\"plaintext\">import pytest\r\nfrom Bio.Seq import Seq\r\n\r\n\r\ndef alter_dna(seq):\r\n    \"\"\"Synonymously alter a DNA sequence\"\"\"\r\n    # Custom function logic here; this placeholder returns the input unchanged\r\n    altered_seq = seq\r\n    return altered_seq\r\n\r\n\r\ndef translate(dna):\r\n    \"\"\"Translate DNA into protein\"\"\"\r\n    return str(Seq(dna).translate())\r\n\r\n\r\n# Run the test 100 times (ncalls) with generated input for seq\r\n@pytest.mark.randomize(seq=str, choices=[\"A\", \"C\", \"G\", \"T\"], ncalls=100)\r\ndef test_altered_seq(seq):\r\n    altered_seq = alter_dna(seq)\r\n    original_protein = translate(seq)\r\n    altered_protein = translate(altered_seq)\r\n    # Property under test: the protein sequence must not change\r\n    assert original_protein == altered_protein<\/pre>\n<p>&nbsp;<\/p>\n<p>Save the test script to a file named <code data-sourcepos=\"42:39-42:48\">seqtest.py<\/code> and invoke it as follows:<\/p>\n<p><code>pytest seqtest.py<br \/>\n<\/code><\/p>\n<p dir=\"auto\" data-sourcepos=\"46:1-46:506\">This code snippet generates 100 random DNA strings, passes them to a custom function that is expected to introduce synonymous mutations (i.e., altering the DNA sequence in a way 
that leaves the resulting amino acid sequence unmodified), and finally checks against the specification: are the protein translations of the original and altered sequences identical? If the property is violated in even one of the generated test cases, the test fails and prints information about the failing case.<\/p>\n<p dir=\"auto\" data-sourcepos=\"48:1-48:289\">This is an overly simplified example, lacking many of the subtleties a real QuickCheck-style test would have. Still, it should convey the idea: testing assumed properties against random input helps uncover unforeseen edge cases and ultimately leads to better code quality.<\/p>\n<h2 dir=\"auto\" data-sourcepos=\"50:1-50:10\"><a id=\"user-content-summary\" class=\"anchor\" href=\"#summary\" aria-hidden=\"true\"><\/a>Summary<\/h2>\n<p dir=\"auto\" data-sourcepos=\"51:1-51:343\">QuickCheck is a software library that assists in software testing by generating test cases for test suites. It supports property-based testing: you state logical properties a function should fulfill as assertions, and these are checked against randomly generated input. This can help identify bugs and defects that might not be found through other testing methods.<\/p>\n<h2 dir=\"auto\" data-sourcepos=\"53:1-53:19\"><a id=\"user-content-acknowledgements\" class=\"anchor\" href=\"#acknowledgements\" aria-hidden=\"true\"><\/a>Acknowledgements<\/h2>\n<p dir=\"auto\" data-sourcepos=\"54:1-54:478\">I first learned about QuickCheck at the CCC 2019, in the talk <a href=\"https:\/\/fahrplan.events.ccc.de\/congress\/2019\/Fahrplan\/system\/event_attachments\/attachments\/000\/004\/102\/original\/Getting_software_right_with_properties__generated_tests__and_proofs.pdf\" target=\"_blank\" rel=\"nofollow noreferrer noopener\" data-sourcepos=\"54:62-54:321\" class=\"external\">\u201cGetting software right with properties, generated tests, and proofs\u201d<\/a> by Mike Sperber. 
The concept dates back to a <a href=\"https:\/\/www.cse.chalmers.se\/~rjmh\/QuickCheck\/\" target=\"_blank\" rel=\"nofollow noreferrer noopener\" data-sourcepos=\"54:368-54:428\" class=\"external\">Haskell tool<\/a> written by Koen Claessen and John Hughes in 1999.<\/p>\n<h2 dir=\"auto\" data-sourcepos=\"54:1-54:478\">Author<\/h2>\n<p><a href=\"mailto:stefanie.muehlhausen@gwdg.de\"><strong>Stefanie M\u00fchlhausen<\/strong><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>As a domain scientist doing high-performance computing, you might find that you need to write code yourself, be it custom scripts for efficient file handling, extensions to pre-existing software, or even a larger software project built from scratch. In any case, it is not enough just to write the code. Instead, software &#8230; <a title=\"Unleashing Chaos: The Power of Random Testing in Scientific Software Development\" class=\"read-more\" href=\"https:\/\/info.gwdg.de\/news\/unleashing-chaos-the-power-of-random-testing-in-scientific-software-development\/\" aria-label=\"More information about Unleashing Chaos: The Power of Random Testing in Scientific Software 
Development\">Read more<\/a><\/p>\n","protected":false},"author":166,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[136],"tags":[],"class_list":["post-23675","post","type-post","status-publish","format-standard","hentry","category-science-domains"],"_links":{"self":[{"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/posts\/23675","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/users\/166"}],"replies":[{"embeddable":true,"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/comments?post=23675"}],"version-history":[{"count":2,"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/posts\/23675\/revisions"}],"predecessor-version":[{"id":23677,"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/posts\/23675\/revisions\/23677"}],"wp:attachment":[{"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/media?parent=23675"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/categories?post=23675"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/tags?post=23675"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}