Abstract
The femoral neck axis serves as a critical parameter in evaluating hip joint health, particularly in the pediatric population. Commonly used metrics for evaluating femoral torsion, such as the femoral neck-shaft and femoral anteversion angles, rely heavily on precise definitions of the position and orientation of the femoral neck axis. Current measurement methods, which employ radiographs or two-dimensional (2D) measurements on computed tomography (CT) scans, are susceptible to errors due to their reliance on reader experience and the inherent limitations of 2D measurements. We hypothesized that utilizing volumetric data would mitigate these errors and enable more accurate and reproducible measurements of the femoral neck axis using the femoral anteversion and femoral neck-shaft angles. To test this hypothesis, we analyzed CT scans of eight hips from a historical collection of postmortem infant femoral and pelvic bones (28 hips) aged 0 to 6.5 months, with an average estimated age of 4.68 ± 1.80 months. Our findings revealed an average neck-shaft angle of 128.00 ± 4.92 deg and femoral anteversion angle of 35.56 ± 11.68 deg across all femurs, consistent with literature values. These measurements obtained from volumetric image data were found to be repeatable and reliable compared to conventional methods. Our study suggests that the proposed methodology offers a standardized approach for obtaining repeatable and reproducible measurements, thus potentially enhancing diagnostic accuracy and clinical decision-making in assessing hip developmental conditions in pediatric patients.
1 Introduction
Quantifying the femoral neck axis in pediatric populations is crucial for diagnosing pathological conditions in the hip joint [1], such as developmental dysplasia of the hip (DDH) in infants or cerebral palsy in older children [2,3]. The femoral neck axis is used as a reference for several measurements used to quantify malformations in the proximal femur, which, if left untreated, can be detrimental to growth and development [4]. Two important morphological features that require referencing the femoral neck axis are the femoral neck anteversion and femoral neck-shaft angles. Femoral anteversion (or femoral neck anteversion) refers to the rotation of the femoral neck about the femur's longitudinal axis and describes cases of increased femoral torsion or increased femoral version [5,6]. The femoral anteversion angle (FAV) averages approximately 30–35 deg at birth in otherwise healthy infants and decreases to about 10–20 deg in adulthood [7–10]. However, the reported FAVs measured in pediatric populations with DDH are inconsistent. Some studies found increased FAVs in infants [11] and children [7] with dysplastic hips, with an average FAV of 55 deg at year one for severe dislocations [9], while others found no significant differences in FAV between healthy and dysplastic hips [12–14]. Similarly, FAVs in children with cerebral palsy are about 10 deg higher than in healthy children [5]. The femoral neck-shaft angle (NSA) averages 130–135 deg at birth in healthy infants, peaks around 145 deg in years 1–3, and decreases significantly in adolescence and adulthood [4,9].
There are no well-established, gold-standard methods for measuring NSA [15] or the FAV [3,16]. However, both of these metrics are generally obtained using two-dimensional (2D) measurements from radiographs or computed tomography (CT) of one or more transverse image slices for the FAV [6,16] or a single coronal slice for the NSA [4]. These conventional 2D methods may cause some variation in results depending on the patient's positioning during imaging [4]. Due to the lack of established methods, there are variations in the literature on how the femoral neck axis is defined for measuring the NSA and FAV. Zhang et al. [17] proposed a method for describing the three-dimensional (3D) femoral neck axis in adult femurs. The authors defined the femoral neck axis as the line that connects the geometric center of the femoral head and the midpoint of a tangent line to a concentric sphere at the center of the femoral head. They concluded that their methods provided an accurate and repeatable process for determining the femoral neck axis when measuring the femoral neck torsion angle, which the authors emphasize was different from the femoral neck anteversion angle. Souza et al. [1] explored multiple ways of defining the femoral neck axis for measuring the FAV on adult femurs. They found that using the axis oblique section on the 2D CT provided the best approximation of FAVs. Schmaranzer et al. [18] considered four automated 3D methods for measuring the FAV based on the center of the femoral head and various landmark points on the proximal femur. The authors reported results comparable to those of conventional 2D methods but could not make claims regarding which 3D-based method was superior in performance.
While some authors are investigating using 3D data to define the femoral neck axis, most reported FAVs and NSAs are from 2D methods. There is a lack of consensus on an established method for determining the femoral neck axis, which may lead to variability in the reported values. Some authors suggest that 3D methods are more reliable than 2D methods [3], while others find comparable results [18]. Even in healthy populations, the FAV depended on the landmarks identified and the imaging technique used [5]. The FAV and NSA are important anatomical features in the femur for diagnostic purposes and for understanding healthy femoral growth and development [4]. Therefore, this study aimed to investigate the variability of the femoral neck axis in the infant femur from medical images through two measurements: the NSA and the FAV. We hypothesized that volumetric (or 3D) measurements would minimize the errors associated with two-dimensional measurements and allow for more detailed, reproducible measurements.
2 Materials and Methods
The decedents used in this study were from a small historical collection, denoted here as the Ortolani collection. It comprises postmortem infant femoral and pelvic bones with estimated ages ranging from 0 to 6.5 months, with an average estimated age of 4.68 ± 1.80 months. The collection, which includes healthy hips and hips with varying degrees of dysplasia, was obtained from the University of Padua, Italy [19]. The decedents were infants who died of infectious diseases, such as influenza and gastroenteritis, that were common in children during the pre-antibiotic era [12]. Of the 14 total decedents (28 hips) in this collection, we obtained CT scans (slice thickness: 0.45 mm) of four decedents (eight hips). Volumetric anatomical models were generated from the CT scans of the femurs using Synopsys® Simpleware ScanIP, a medical image processing software package. All 3D models were constructed using thresholding, a semi-automatic segmentation approach that generates a binary segmentation by separating a grayscale image into two regions based on a selected threshold value. A threshold value of −50 Hounsfield Units was selected for all decedents. All measurements in the current study were performed in Simpleware ScanIP by two observers following the same protocol.
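The thresholding step can be illustrated with a minimal sketch. This is not the Simpleware ScanIP implementation; it assumes the CT volume is already available as a NumPy array in Hounsfield Units and that voxels at or above the threshold form the foreground. The function name `threshold_segment` and the synthetic volume are hypothetical.

```python
import numpy as np


def threshold_segment(volume_hu, threshold_hu=-50.0):
    """Binary segmentation: True where a voxel is at or above the threshold.

    Assumption: the foreground is the region at or above the threshold;
    the source only states that the image is split into two regions.
    """
    return np.asarray(volume_hu, dtype=float) >= threshold_hu


# Tiny synthetic volume in Hounsfield Units (hypothetical values).
volume = np.full((4, 4, 4), -1000.0)  # air-like background
volume[1:3, 1:3, 1:3] = 300.0         # a 2 x 2 x 2 block of denser voxels

mask = threshold_segment(volume, threshold_hu=-50.0)
print(int(mask.sum()))  # number of foreground voxels in the mask
```

In practice the resulting mask would still need manual cleanup, which is why the approach is described as semi-automatic.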
2.1 Femoral Neck-Shaft Angle.
As shown in Fig. 1, the femoral neck-shaft angle (NSA) was computed as the angle between the femoral neck and femoral shaft axes. The femoral neck axis was defined as the line connecting the center of the femoral head (FHC) and the center of the femoral neck (FNC) [20]. The FHC was defined in 3D as the center of a best-fit sphere, denoting the geometric center of the femoral head. For the femoral neck, a region of interest (ROI) was selected using the paint tools in Simpleware ScanIP by “painting” all visible regions of the femoral neck, and the FNC was taken as the center of the sphere completely enclosed by the selected ROI. For the femoral shaft, ROIs were selected at two sections, one proximal (FS1) and one distal (FS2). An inner best-fit sphere was computed for each section such that the sphere fit inside the ROI, and the line connecting the centers of the two spheres was defined as the femoral shaft axis.

Fig. 1 The femoral neck-shaft angle measurement: (a) projection of the femur onto the coronal plane, (b) projection onto the sagittal plane, and (c) 3D generated model. The femoral neck axis was defined as the line connecting the femoral head center (FHC, red) with the center of the femoral neck (FNC, blue). The femoral shaft axis was defined as the line connecting a center point on the proximal femoral shaft (FS1) with a center point on the distal femoral shaft (FS2).
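The geometry of the NSA measurement can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the procedure run in Simpleware ScanIP: the femoral head center is recovered with an algebraic least-squares sphere fit (one of several possible best-fit formulations), the other landmarks are given directly as coordinates, the neck axis is oriented from FNC toward FHC, and the shaft axis from FS1 toward FS2. The function names and all coordinates are hypothetical.

```python
import numpy as np


def fit_sphere(points):
    """Algebraic least-squares sphere fit; returns (center, radius)."""
    pts = np.asarray(points, dtype=float)
    # |p|^2 = 2 c.p + (r^2 - |c|^2) is linear in the unknowns (c, r^2 - |c|^2).
    A = np.hstack([2.0 * pts, np.ones((len(pts), 1))])
    b = (pts ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = sol[:3]
    radius = float(np.sqrt(sol[3] + center @ center))
    return center, radius


def angle_deg(u, v):
    """Angle between two 3D vectors, in degrees."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    cos_angle = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


def neck_shaft_angle(fhc, fnc, fs1, fs2):
    """NSA as the angle between the neck axis (FNC -> FHC) and shaft axis (FS1 -> FS2)."""
    neck = np.asarray(fhc, dtype=float) - np.asarray(fnc, dtype=float)
    shaft = np.asarray(fs2, dtype=float) - np.asarray(fs1, dtype=float)
    return angle_deg(neck, shaft)


# Synthetic landmarks (mm): x medial, y anterior, z superior (assumed frame).
true_fhc = 12.0 * np.array([np.cos(np.radians(40.0)), 0.0, np.sin(np.radians(40.0))])
theta, phi = np.meshgrid(np.linspace(0.2, np.pi - 0.2, 8),
                         np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False))
unit = np.stack([np.sin(theta) * np.cos(phi),
                 np.sin(theta) * np.sin(phi),
                 np.cos(theta)], axis=-1).reshape(-1, 3)
head_surface = true_fhc + 6.0 * unit  # points sampled on a 6 mm sphere

fhc, radius = fit_sphere(head_surface)
fnc = np.zeros(3)
fs1 = np.array([-5.0, 0.0, -10.0])
fs2 = np.array([-5.0, 0.0, -60.0])
nsa = neck_shaft_angle(fhc, fnc, fs1, fs2)
print(round(nsa, 2), round(radius, 2))
```

The synthetic neck is tilted 40 deg above the horizontal and the shaft points straight down, so the landmarks are constructed to give an NSA of about 130 deg. The direction convention matters: reversing either axis would return the supplementary angle.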

2.2 Femoral Anteversion Angle.
The femoral anteversion angle was computed as the angle between the femoral neck axis and the posterior condylar axis of the knee (Fig. 2). As described in Fig. 3, the posterior condylar axis was defined by two points denoting the most posterior points on the medial condyle (MC) and lateral condyle (LC) [2,16]. Defining the distal femoral axis using the posterior condylar line was reportedly the least location-dependent and most repeatable method [5]. The femoral neck axis reference line created for the NSA measurement was maintained for the FAV measurements, and only the reference line for the posterior condylar axis was modified. One decedent (two femurs) was excluded from the FAV measurement because the distal femoral condyles, which are crucial anatomical features for identifying the posterior condylar axis and consequently measuring the FAV, were not intact.

Fig. 2 The femoral anteversion angle, showing (a) a head-on view of the femur (axial plane), (b) the 3D generated model, and (c) an isometric axial view. The femoral neck axis was defined as the line connecting the center of the femoral head with the center of the femoral neck. The posterior condylar axis was defined as the line connecting the most posterior points on the distal MC and LC.


Fig. 3 The landmarks used to define the distal femoral axis on the 3D generated models. The axis was defined using the most posterior points on the distal MC and LC. The 3D models are shown in the (a) posterior isometric view, (b) medial side view (sagittal plane), (c) top-down view of the anterior femur, and (d) inferior axial view (axial plane).
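The FAV measurement can be sketched in the same way. The text defines the FAV as the angle between the femoral neck axis and the posterior condylar axis; the sketch below additionally assumes that both axes are first projected onto the plane perpendicular to the femoral shaft axis, which mirrors the head-on axial view but is an assumption rather than a documented step of the protocol. The function names, the direction conventions (FNC toward FHC, LC toward MC), and the coordinates are hypothetical.

```python
import numpy as np


def angle_deg(u, v):
    """Angle between two 3D vectors, in degrees."""
    cos_angle = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


def project_onto_plane(v, normal):
    """Remove from v its component along the plane normal."""
    n = normal / np.linalg.norm(normal)
    return v - (v @ n) * n


def femoral_anteversion(fhc, fnc, mc, lc, fs1, fs2):
    """FAV between the neck axis and the posterior condylar axis.

    Assumption: both axes are projected onto the plane perpendicular
    to the shaft axis (FS1 -> FS2) before the angle is taken.
    """
    fhc, fnc, mc, lc, fs1, fs2 = (np.asarray(p, dtype=float)
                                  for p in (fhc, fnc, mc, lc, fs1, fs2))
    shaft = fs2 - fs1
    neck = project_onto_plane(fhc - fnc, shaft)
    condylar = project_onto_plane(mc - lc, shaft)
    return angle_deg(neck, condylar)


# Synthetic landmarks (mm): x medial, y anterior, z superior (assumed frame).
elev, ante = np.radians(40.0), np.radians(35.0)
fnc = np.zeros(3)
fhc = 12.0 * np.array([np.cos(elev) * np.cos(ante),
                       np.cos(elev) * np.sin(ante),
                       np.sin(elev)])
fs1 = np.array([0.0, 0.0, -10.0])
fs2 = np.array([0.0, 0.0, -60.0])
lc = np.array([-15.0, -10.0, -70.0])
mc = np.array([15.0, -10.0, -70.0])

fav = femoral_anteversion(fhc, fnc, mc, lc, fs1, fs2)
print(round(fav, 2))
```

The synthetic neck is rotated 35 deg anteriorly in the axial plane, so the projected angle is about 35 deg. Without the projection step, the raw 3D angle between the two axes would also include the superior tilt of the neck and would be larger.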

2.3 Statistical Analysis.
The descriptive statistics of the data are presented as the mean value ± standard deviation. The measurements are presented both as the mean value across the two examiners and as the values computed by each examiner. The intraclass correlation coefficient (ICC) was computed to determine inter-rater and intra-rater reliability, that is, how consistent the two observers were with one another and with themselves when taking the measurements. The reliability for NSA and FAV was computed using ICC with a 95% confidence interval (CI) in IBM SPSS Statistics 27 [21].
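For readers unfamiliar with the ICC, the following sketch computes one common form from a subjects-by-raters table. The text does not state which ICC model was selected in SPSS, so the two-way random-effects, absolute-agreement, single-measures form, often written ICC(2,1), is an assumption here. The function name `icc_2_1` and the readings are hypothetical, and the confidence interval is not computed.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measures.

    ratings: array of shape (n_subjects, k_raters).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares for subjects (rows), raters (columns), and residual.
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return float((ms_rows - ms_err)
                 / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n))


# Hypothetical NSA readings (deg): one row per femur, one column per observer.
nsa_readings = np.array([
    [128.9, 127.5],
    [131.2, 129.8],
    [122.4, 123.1],
    [135.0, 133.6],
    [126.3, 126.9],
    [129.7, 128.2],
])
print(round(icc_2_1(nsa_readings), 3))
```

Because this form measures absolute agreement, a constant offset between observers lowers the ICC even when their rankings of the femurs agree.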
3 Results and Discussion
In this study, eight femurs (four left, four right) were used to test the variability in the femoral neck axis. The average estimated age of the decedents was 3.88 ± 2.84 months (range: 0 to 6 months). A total of ten measurements (five by each observer) were taken for each femur (20 per decedent) for the NSA and the FAV, except for one decedent that was missing crucial anatomy needed for the FAV. The purpose was to investigate the variability of the femoral neck axis in the infant femur through the NSA and the FAV. The hypothesis was that 3D measurements would be more repeatable and reproducible since they can draw on more information than 2D measurements.
The NSA ranged from 120.50 deg to 139.66 deg for the right femurs, averaging 128.55 ± 5.00 deg. The NSA ranged from 118.83 deg to 139.39 deg for the left femurs, with an average of 127.45 ± 4.84 deg. The average NSA for all femurs was 128.00 ± 4.92 deg. The average FAV and NSA values for each observer are shown in Table 1. The mean NSA for all decedents was higher for observer 1, while the mean FAV was nearly identical for both observers. The FAV ranged from 20.35 deg to 47.80 deg for the right femurs, with an average of 32.89 ± 10.39 deg. The FAV ranged from 22.00 deg to 55.87 deg for the left femurs, with an average of 38.23 ± 12.44 deg. The average FAV for all femurs was 35.56 ± 11.68 deg.
Table 1 Descriptive statistics (mean ± SD, in deg) for FAV and NSA for both observers, separated by left and right sides

Observer | Measurement | Right | Left | All
---|---|---|---|---
Observer 1 | NSA | 128.86 ± 5.22 | 128.52 ± 5.29 | 128.69 ± 5.19
Observer 1 | FAV | 32.57 ± 10.57 | 38.45 ± 12.48 | 35.51 ± 11.75
Observer 2 | NSA | 128.24 ± 4.89 | 126.38 ± 4.21 | 127.31 ± 4.60
Observer 2 | FAV | 33.21 ± 10.57 | 38.00 ± 12.85 | 35.60 ± 11.81
All | NSA | 128.55 ± 5.00 | 127.45 ± 4.84 | 128.00 ± 4.92
All | FAV | 32.89 ± 10.39 | 38.23 ± 12.44 | 35.56 ± 11.68
The overall inter-rater reliability for all femurs was 0.74 (CI: 0.53–0.85) for the NSA and 0.98 (CI: 0.96–0.99) for the FAV. A breakdown of the inter-rater reliability by femur side is provided in Table 2. The overall NSA intra-rater reliability was 0.80 (CI: 0.58–0.95) for observer 1 and 0.70 (CI: 0.42–0.92) for observer 2. In contrast, the FAV intra-rater reliability was high for both observers, at 0.98 (CI: 0.92–1.00) for observer 1 and 0.98 (CI: 0.94–1.00) for observer 2.
Table 2 Inter-rater reliability (ICC) using all five measurements per subject by each observer

Measurement | Side | Inter-rater score | 95% CI lower | 95% CI upper
---|---|---|---|---
NSA | Left | 0.66 | 0.27 | 0.85
NSA | Right | 0.82 | 0.60 | 0.92
NSA | Overall | 0.74 | 0.53 | 0.85
FAV | Left | 0.97 | 0.92 | 0.99
FAV | Right | 0.99 | 0.98 | 1.00
FAV | Overall | 0.98 | 0.96 | 0.99
Based on the results, the 3D FAV measurements were not largely affected by the observer. The mean values were consistent between observers, with a mean difference of 0.90 ± 2.30 deg for all measurements. The mean differences between observers for the left and right FAV measurements were 0.45 ± 3.06 deg and 0.63 ± 1.00 deg, respectively. This demonstrates that the chosen 3D-based measurement methods for FAV allow for consistent, repeatable results across decedents and observers. For the NSA, however, the differences between observers were larger. The mean difference was 1.38 ± 3.41 deg for all measurements, with differences of 2.14 ± 3.62 deg and 0.61 ± 3.08 deg between observers on the left and right sides, respectively. The NSA findings further demonstrate the difficulty in defining the femoral shaft axis. This reference axis is known to be challenging to define due to the anterior curvature of the femoral shaft [5].
The inter- and intra-rater reliability for FAV was excellent and was not influenced by the number of measurements per decedent for each observer, as demonstrated by the inter-rater (Fig. 4) and intra-rater (Fig. 5) ICCs. However, the NSA inter- and intra-rater reliability seemed slightly affected by the number of measurements performed by each observer (Figs. 4 and 5, respectively). Using individual NSA measurements gave the lowest inter-rater reliability (0.66, CI: 0.29–0.85), whereas using the mean of two or more measurements gave a reliability of 0.76 or higher. The intra-rater reliability, however, demonstrated some inconsistency in both observers' measurements for the NSA regardless of the number of measurements taken for each decedent. Figure 6 displays the variation in the global location of the FNC and its relationship to both the observer and the 3D measurements analyzed herein. The findings in Fig. 6 suggest that the FAV was less sensitive than the NSA to the global location of the FNC.

Fig. 4 Inter-rater reliability, computed as the ICC of the mean of a varying number of measurements for all decedents by each observer. UB and LB denote the upper and lower bounds of the 95% confidence interval.

Fig. 5 Intra-rater reliability for both observers, computed as the ICC of a varying number of measurements for all decedents. UB and LB denote the upper and lower bounds of the 95% confidence interval.

Fig. 6 The position of the center of the femoral neck (FNC) in the global x- (medial–lateral), y- (anterior–posterior), and z- (inferior–superior) directions and its relationship to the measured FAV and NSA for both observers
Overall, the average NSA and FAV values were 128.00 ± 4.92 deg (range: 118.83–139.66 deg) and 35.56 ± 11.68 deg (range: 20.35–55.87 deg), respectively. The 3D NSAs were consistent with 2D-based measurements found in the literature, which were between 115 deg and 140 deg [1,22] for second- and third-trimester fetuses. The 3D NSAs also agree with the average 2D-based NSA at birth (130–135 deg) in healthy infants reported by Hensinger [9]. Park et al. [2] reported average 2D-based NSA and FAV of 133.7 ± 5.4 deg and 35.1 ± 12.1 deg in children with cerebral palsy. Similarly, Haddad et al. [15] reported large ranges of NSA in male and female populations, albeit in adult patients. The authors reported NSAs of 119.5 to 145 deg (mean: 132.3 ± 4.4 deg) in males and 116.5 to 145.5 deg (mean: 130.1 ± 4.8 deg) in females. The average 2D-based FAV in the Ortolani collection reported by Huayamave et al. [12] was 35.8 ± 6.5 deg. These measurements were described as manual measurements taken using a goniometer and the photographic method. The methods used in the current study mimic the 2D-based photographic method in the definition of the femoral neck axis and distal femoral axis. The average 3D FAV values were close to those reported in Ref. [12], with a mean difference of 0.20 ± 11.35 deg. Although there were minimal differences between 2D- and 3D-based FAVs in the current study, some discrepancies have been reported in the literature. Cai et al. [3] found significant differences in FAV measurements between 2D and 3D methods in children with hip disorders. The authors attributed these differences to the differences in accuracy and reliability between 2D and 3D methods. Furthermore, the authors found the 2D-based method (2D CT) less accurate and reliable than the 3D-based method. Although Van Fraeyenhove et al. [6] did not comment on the accuracy of 2D methods relative to 3D methods, the authors found that 2D measurements (Murphy's method [23]) underestimated “true” femoral torsion.
Other studies have performed 3D measurements on the femur from segmented volumetric models. Davis et al. [24] performed 3D measurements of the FAV on adult femurs and reported a mean FAV of 12.7 ± 9.1 deg when using the distal posterior condylar axis as the distal reference line. Schmaranzer et al. [18] reported similar femoral version angles when comparing automated 3D-based and proximal/distal 2D-based methods. After comparing multiple 3D-based methods, the authors found no significant results supporting the use of one 3D-based method over another. However, the authors did find comparable results for the neck method and the head-shaft method. The neck method defined the femoral neck axis as a best-fit line between the medial and lateral femoral neck. In contrast, the head-shaft method defined it as a line connecting the femoral head center to the femoral shaft at the greater trochanter [18], similar to the 2D-based method developed by Murphy in 1987 [23].
The variability in the femoral neck axis may be affected by the age of the decedents. Boniforti et al. [25] investigated the variability in pelvic measurements of infants 3 to 36 months and found that the measurements varied the most during early infancy. This is when the entire femoral head and most of the acetabulum consist of cartilage, which is difficult to see on pelvic radiographs. The variability they found was of particular concern when diagnosing and managing DDH in infants. The authors determined that errors associated with various pelvic measurements varied with the age of the infants, suggesting that age-appropriate indicators of DDH should be used during diagnosis. The results of their study suggest that how the measurements are reported may also influence the variability found in literature, in addition to the variability caused by landmarks and imaging techniques. Based on the FAV method employed, measurements can lead to mean differences of up to 10 deg or higher in individuals with increased anteversion angles as a result of the imaging modality or the landmarks used [5]. Due to the variability found in literature, some suggest employing threshold values for the various 2D and 3D FAV measurement approaches [6,26].
The study is limited by the sample size and the absence of a healthy control group. The small sample size is due to the small number of available high-resolution CT scans of infant femurs. Therefore, the results of the study should not be used for clinical assessment. However, despite the small sample size, the measurements were consistent with those found in the literature. Additionally, there is a possibility that the NSA and FAV measurements were influenced by how the decedents were (1) dissected and (2) positioned during imaging. Although Haddad et al. [15] studied adult patients, the authors found a significant difference in NSA measurements between radiographs acquired with patients in the upright and supine positions. However, since the NSA in the current study was measured only in 3D, some of the variability in this measurement that is caused by patient positioning may be alleviated. Mayr et al. [27] found that while 2D-based methods (CT) required neutral positioning of the legs to obtain accurate measurements, 3D-based methods allowed for femoral torsion measurements independent of positioning. Regardless, more research is needed on the effect of positioning on 2D and 3D NSA measurements in infants to properly address this source of variability. Finally, it is not possible to draw conclusions about the number of measurements that should be taken for each patient or the number of observers needed, as this depends on several factors, such as the landmarks, imaging techniques, the skill of the observer, and the morphology of the patient. Therefore, the findings should not be used for clinical decision-making.
4 Conclusion
Our study aimed to evaluate infant femoral torsion by exploring the variability of the femoral neck axis through the femoral neck-shaft and anteversion angles using three-dimensional measurements. These angles serve as vital metrics for evaluating hip joint health. By employing a 3D approach, we sought to mitigate some of the variability in femoral neck axis measurements induced by patient positioning. Our findings align with existing literature values and demonstrate the accuracy and reliability of our 3D methodology compared to other conventional methods. Importantly, our approach offers the potential to achieve repeatable and reproducible measurements across different observers. Further research and validation studies are warranted to fully establish the clinical utility and applicability of our approach in diverse patient populations and clinical settings. In summary, our study contributes to advancing the understanding and measurement of femoral neck axis parameters, offering a valuable tool for clinicians and researchers in the fields of orthopedics and pediatric hip health.
Acknowledgment
This work was supported by the International Hip Dysplasia Institute (IHDI).
Funding Data
The U.S. National Science Foundation CAREER Award (No. CMMI-2238859; Funder ID: 10.13039/100000147).
Data Availability Statement
The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.