Performance of the BS detector
The CNN-based BS detector developed in this study achieved an accuracy of 91.06% with well-balanced sensitivity and specificity. All four subtypes of BS were successfully extracted by the detector.
Almost all samples of the SB and HS subtypes were correctly predicted. An SB is a simple pulse, usually with a clear peak in the spectrogram and no other SB within 100 ms on either side. HS are whistling-like sounds with one to a dozen frequency components in the spectrogram. The harmonic frequencies of the HS recorded in the study reached up to 4000 Hz. These characteristics make the spectrograms of these two types of BS spiky and therefore easily distinguishable from the spectrograms of noise segments, which are usually flat. The high detection accuracy for these two subtypes is therefore not surprising. Because HS may be indicative of an obstruction, this finding is of particular clinical value.
We also demonstrated a high detection rate for MB (93.87%). An MB can be described as a cluster of SB. The bursts within an MB look quite similar, although the pulse amplitude and the frequency bandwidth may differ slightly. There are clear silent gaps between adjacent bursts, and the lengths of these gaps are inconsistent. The slight decrease in accuracy is possibly due to these silent gaps.
The CRS samples had the lowest accuracy, although this was still around 80%. CRS is a continuous random sound in which everything is clustered together, without a clear rhythm or pattern. This lack of a clear pattern makes CRS more variable and therefore more difficult to recognize. However, we believe that the accuracy of detecting CRS could increase with a more comprehensive dataset of CRS samples.
Our detector has some advantages over other methods developed to extract BS. Zhao K et al. [17, 18] developed CNN-based BS recognizers and reported accuracy similar to ours (above 90%). However, their models were designed as low-computational-complexity solutions and were therefore restricted to identifying whether a bowel sound had appeared in a segment of 1 s duration. They were successful in their aim. However, unlike our detector, their models do not provide a time stamp for the BS.
Liu J. et al. developed an LSTM-based BS detector that determines the beginning and ending points of every bowel sound by segmenting the recording and detecting BS in each segment. Although they reported accuracy and sensitivity above 90% on the test set, the sensitivity dropped to 62% when they tested the model under ‘the real use’ condition.
Horiyama et al. have recently reported an ANN-based BS detector that uses improved power-normalized cepstral coefficients. Their model achieved good accuracy in both quiet and noisy environments. However, the unbalanced dataset used in their work may have caused problems, as they also reported a low positive predictive value of around 60% ± 20%.
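The link between class imbalance and a low positive predictive value (PPV) can be illustrated with a short calculation. The sketch below is not taken from any of the cited studies: the function name and the sensitivity, specificity and prevalence values are hypothetical, chosen only to show how PPV falls as the proportion of true BS segments shrinks while sensitivity and specificity stay fixed.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV from sensitivity, specificity and the fraction of positive segments.

    All three inputs are proportions in [0, 1]. The values used below are
    illustrative assumptions, not results from any cited study.
    """
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Same hypothetical detector evaluated on a balanced and an unbalanced set.
balanced = positive_predictive_value(0.9, 0.9, prevalence=0.5)
unbalanced = positive_predictive_value(0.9, 0.9, prevalence=0.1)
print(f"PPV, balanced set:   {balanced:.2f}")
print(f"PPV, unbalanced set: {unbalanced:.2f}")
```

With these assumed values, the PPV drops from about 0.9 on the balanced set to about 0.5 when only one segment in ten contains a BS, even though the detector itself is unchanged.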
With our proposed detector, we achieved accuracy similar to that of other researchers, but with both high sensitivity and specificity. Further, our comparatively large study demonstrates the good generalizability of our detector. The method also gives an accurate start and end point for each BS (see methods section for detail), which is important for calculating BS features and for BS-feature-based analysis. In addition, the accuracy of our model in detecting bowel sound types SB, MB and HS was excellent. The model could be further improved with a more comprehensive dataset of CRS samples. The extra information provided by BS features, or by the number of occurrences of specific BS subtypes, may provide additional clinical insight.
Effect of food intake
We found a significant increase in total BS duration after food consumption. Similar results have been reported previously, and this consistency further validates our approach. Du et al. showed that BS density and summed amplitude were significantly higher after food consumption. Recently, Horiyama et al. also reported that the density and length of BS increased significantly after coffee and soda intake.
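As a rough illustration of how such time-domain summaries follow from detected events, the sketch below computes total BS duration and a simple event rate from a list of start and end times. The function name, the input format and the definition of density as events per minute are assumptions made for this example; the cited studies may define these quantities differently.

```python
def summarize_events(events, recording_seconds):
    """Summarize detected BS events.

    events: list of (start_s, end_s) tuples in seconds, assumed not to overlap.
    recording_seconds: length of the recording period in seconds.
    Returns (total_duration_s, events_per_minute).
    """
    total_duration = sum(end - start for start, end in events)
    events_per_minute = len(events) / (recording_seconds / 60.0)
    return total_duration, events_per_minute


# Hypothetical events from a 60 s recording period.
detected = [(0.5, 0.7), (2.0, 2.5), (10.0, 10.1)]
duration, rate = summarize_events(detected, recording_seconds=60.0)
print(f"total BS duration: {duration:.2f} s, rate: {rate:.1f} events/min")
```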
The frequency-domain BS features, SC and SBW, show different results. The spectral bandwidth (SBW) differs significantly between the two recording periods, whereas the spectral centroid (SC) differs significantly only at the RLQ. The spectral centroid is the “centre of mass” of the spectrum, and the spectral bandwidth is defined as the frequency range in which the amplitude is not less than half its maximum value. The results suggest that food intake shifts the main frequency component of BS recorded from the RLQ and reduces the energy spread of the frequency spectrum of BS recorded from both the RLQ and the LUQ.
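The two definitions above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the study: it assumes a magnitude spectrum from a plain FFT of the whole segment, and it reads the bandwidth as the span between the lowest and highest frequencies whose amplitude is at least half the maximum. The function name and the test signal are hypothetical.

```python
import numpy as np


def spectral_features(signal, fs):
    """Return (centroid_hz, bandwidth_hz) for a 1-D signal sampled at fs Hz."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.abs(np.fft.rfft(signal))            # magnitude spectrum (assumption)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)  # frequency of each bin in Hz

    # Centroid: amplitude-weighted mean frequency, the "centre of mass".
    centroid = np.sum(freqs * spectrum) / np.sum(spectrum)

    # Bandwidth: span of the bins whose amplitude is >= half the maximum.
    above_half = freqs[spectrum >= 0.5 * spectrum.max()]
    bandwidth = above_half.max() - above_half.min()
    return centroid, bandwidth


# Hypothetical test signal: two equal tones at 400 Hz and 600 Hz, 1 s at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
two_tones = np.sin(2 * np.pi * 400 * t) + np.sin(2 * np.pi * 600 * t)
sc, sbw = spectral_features(two_tones, fs)
print(f"centroid: {sc:.1f} Hz, bandwidth: {sbw:.1f} Hz")
```

For this synthetic signal the centroid sits midway between the two tones and the bandwidth equals their separation. An all-zero segment would need a guard against division by zero, which is omitted here for brevity.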
The MCR is the number of times the waveform crosses its mean value. Previous work by our research group, documented in Fig. 3b of Du et al., shows that the distributions of MCR differ most significantly between BS subtypes. This is because the non-BS intervals, like unvoiced periods in speech, usually have larger MCR values than BS, which are similar to voiced periods. The interval time between bursts within a BS is one of the main differences between BS subtypes. Therefore, the very small difference in MCR between the two periods indicates that food intake likely has little influence on the proportion of each subtype among the generated BS.
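The definition of MCR as a count of mean crossings can be illustrated with the short sketch below. It is an assumed, minimal implementation: the function name is hypothetical, samples lying exactly on the mean are ignored, and no normalization by segment length is applied, which may differ from how the feature was computed in the study.

```python
import numpy as np


def mean_crossing_count(signal):
    """Count how many times a 1-D waveform crosses its own mean value."""
    x = np.asarray(signal, dtype=float)
    signs = np.sign(x - x.mean())
    # Drop samples exactly on the mean so they are not counted as crossings.
    signs = signs[signs != 0]
    # A crossing occurs wherever consecutive remaining samples change sign.
    return int(np.count_nonzero(signs[1:] != signs[:-1]))


# A rapidly alternating waveform crosses its mean more often than a slow one.
fast = [1, -1, 1, -1, 1, -1]
slow = [1, 1, 1, -1, -1, -1]
print(mean_crossing_count(fast), mean_crossing_count(slow))
```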
Moreover, the significance level is generally higher for the BS features from the RLQ. This could be explained by the sensor placement. The sensor on the RLQ is placed near the ileocecal valve, which controls the flow of digested food from the small intestine into the large intestine. The motion of the valve is therefore largely affected by food intake, which is reflected in changes in the acoustic features of the generated BS.